Mahesh Jagannath Salgaonkar <mah...@linux.vnet.ibm.com> writes:

>> I think we could provide a better interface by instead having a file
>> per log message appear in sysfs. We're never going to have more than
>> 128 of these at any one time on the Linux side, so it's not going to
>> be too many files.
>
> It is not just about 128 files; we may be adding/removing a sysfs node
> for every new log id that gets reported to the kernel and ack-ed. In
> the worst case, when we have a flood of elog errors with the user
> daemon consuming them and ack-ing back to get ready for the next log
> in a tight poll, we may continuously add/remove the sysfs node for
> each new <id>.
Do we ever get a storm of hundreds/thousands of them, though? If many
come in at once, userspace may just be woken up once or twice, as it
would just select() and wait for events.

>> I've seen some conflicting things on this - is it 2kb or 16kb?
>
> We chose 16kb because we want to pull all the log data, not partial
> data.

So the max log size for any one entry is in fact 16kb?

>> This means we constantly use 128 * sizeof(struct opal_err_log), which
>> equates to somewhere north of 2MB of memory (due to list overhead).
>>
>> I don't think we need to statically allocate this; we can probably
>> just allocate on demand, as on a typical system you're quite unlikely
>> to have too many of these sitting around (besides, if for whatever
>> reason we cannot allocate memory at some point, that's okay because we
>> can read it again later).
>
> The reason we chose static allocation is that we cannot afford to drop
> or delay a critical error log due to a memory allocation failure. Or
> we could keep static allocation for critical errors and use dynamic
> allocation for informative error logs. What do you say?

Userspace is probably going to have to do IO to get the log and ack it,
so it's probably not a huge problem - if we can't allocate a few kb in
a couple of attempts then we likely have bigger problems.

If we were going to have a sustained rate of hundreds/thousands of these
per second then perhaps we'd have other issues, but from what I
understand we're probably only going to see a handful per year on a
typical system? (I am, of course, not talking about our dev systems,
which are rather atypical :)

I'll likely have a patch today that shows kind of what I mean.

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev