Ingo Molnar <mi...@kernel.org> writes:

> * Ingo Molnar <mi...@kernel.org> wrote:
>
>> * Johannes Berg <johan...@sipsolutions.net> wrote:
>>
>> > On Mon, 2015-08-24 at 17:32 +0300, Alexander Shishkin wrote:
>> >
>> > > This time around, I employed a linker trick to convert the structures
>> > > containing extended error information into integers, which are then
>> > > made to look just like normal error codes so that IS_ERR_VALUE() and
>> > > friends would still work correctly on them. So no extra pointers in
>> > > the struct perf_event or anywhere else; the extended error codes are
>> > > passed around like normal error codes. They only need to be converted
>> > > in syscalls' topmost return statements. This is done in 1/6.
>> >
>> > For the record, as we discussed separately, I'd love to see this move
>> > to more general infrastructure. In wireless (nl80211), for example, we
>> > have a few hundred (!) callsites returning -EINVAL, mostly based on
>> > malformed netlink attributes, and it can be very difficult to figure
>> > out what went wrong; debugging mostly employs a variation of Hugh's
>> > trick.
>>
>> Absolutely, I suggested this as well earlier today, as the scheduler
>> would like to make use of it in syscalls with extensible ABIs, such as
>> sched_setattr().
>>
>> If people really like this then we could go farther as well and add a
>> standalone 'extended errors system call' as well
>> (SyS_errno_extended_get()), which would allow the recovery of error
>> strings even for system calls that are not easily extensible. We could
>> cache the last error description in the task struct.
>
> If we do that then we don't even have to introduce per system call error
> code conversion, but could unconditionally save the last extended error
> info in the task struct and continue - this could be done very cheaply
> with the linker trick driven integer ID.
>
> I.e. system calls could opt in to do:
>
>     return err_str(-EBUSY, "perf/x86: BTS conflicts with active events");
>
> and the overhead of this would be minimal, we'd essentially do something
> like this to save the error:
>
>     current->err_code = code;
>
> where 'code' is a build time constant in essence.

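To make the quoted opt-in idea concrete, here is a rough userspace sketch of "save an integer ID in the task struct, look the string up later". It is not taken from the patch set. `struct task`, `current_task`, the `ERR_SITE_*` enum, `err_str_site()`, `fake_syscall()` and `last_err_message()` are all hypothetical names, and the hand-written enum stands in for the ID that the linker trick would generate at build time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-task structure. */
struct task {
	int err_code;	/* ID of the last annotated error site, 0 if none */
};

static struct task current_task;	/* stand-in for 'current' */

/* One entry per annotated error site. */
struct err_site {
	int errno_val;		/* traditional errno, stored negative */
	const char *message;	/* human-readable description */
};

/* Assumption: IDs are assigned by hand here; the real proposal derives
 * them from the position of each structure in a linker section. */
enum { ERR_SITE_NONE, ERR_SITE_BTS_BUSY, ERR_SITE_COUNT };

static const struct err_site err_sites[ERR_SITE_COUNT] = {
	[ERR_SITE_NONE]     = { 0, NULL },
	[ERR_SITE_BTS_BUSY] = { -EBUSY,
		"perf/x86: BTS conflicts with active events" },
};

/* Record the site ID (a compile-time constant at every call site) and
 * hand back the plain errno, so callers keep seeing ordinary codes. */
static int err_str_site(int id)
{
	assert(id > ERR_SITE_NONE && id < ERR_SITE_COUNT);
	current_task.err_code = id;
	return err_sites[id].errno_val;
}

/* Hypothetical system call body that opts in to the annotation. */
static int fake_syscall(int bts_in_use)
{
	if (bts_in_use)
		return err_str_site(ERR_SITE_BTS_BUSY);
	return 0;
}

/* What a separate "get last error description" call could return. */
static const char *last_err_message(void)
{
	return err_sites[current_task.err_code].message;
}
```

The error path costs a single integer store; the string is only touched when something asks for it through `last_err_message()`.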
I'd propose a mixed approach here: err_str() would still return an integer
in the [-EXT_ERRNO, -MAX_ERRNO] range which would index the err_site struct
and upon returning to userspace we'd do

    current->err_code = code;
    return ext_errno(code); /* the traditional errno */

Reason: the lifetime of this extended error code would be exactly the same
as that of the traditional error value so that we'd always return the most
recent error and wouldn't be prone to something overwriting the error code
under us.

The problem with code checking for different types of errors has two sides
to it:

  * most of those error codes that are checked for shouldn't really be
    annotated at all and should rather remain like they are;
  * with the ones that actually do need to be checked for, the checks
    would change from "if (err == EINTR)" to "if (ext_errno(err) == EINTR)",
    which doesn't seem like a big deal (with ext_errno() being an O(1)
    lookup).

Side note: we should also make sure that only the userspace-visible errors
ever get annotated like that to prevent the error message creep (which would
be an even bigger problem if we go ahead to store the extended error code in
task_struct right at the topmost return statement). Perf example: pretty
much all errors that happen around event scheduling, including stuff that
pmu callbacks return, needn't and shouldn't be annotated at all.

> We could use this even in system calls where the error path is performance
> critical, as all the string recovery and copying overhead would be
> triggered by applications that opt in via the new system call:
>
>     struct err_desc {
>             const char *message;
>             const char *owner;
>             const int code;
>     };
>
>     SyS_err_get_desc(struct err_desc *err_desc __user);
>
> [ Which could perhaps be a prctl() extension as well (PR_GET_ERR_DESC):
>   finally some truly matching functionality for prctl(). ]
>
> Hm?

I like this.
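For illustration, the mixed approach described earlier in this message could look roughly like the userspace sketch below. It is a sketch under assumptions, not the code from the series. `EXT_ERRNO_BASE` and its value, the second table entry, `is_ext_err()`, `ext_err_code()`, `syscall_return()` and `last_err_code` are hypothetical; only MAX_ERRNO (4095) matches the kernel's existing definition. Errors are negative here, following the usual in-kernel convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_ERRNO	4095	/* upper bound that IS_ERR_VALUE() accepts */
#define EXT_ERRNO_BASE	2048	/* hypothetical start of the extended range */

struct err_site {
	int errno_val;		/* traditional errno, stored negative */
	const char *message;
};

/* Stand-in for the linker-collected table of annotated sites. */
static const struct err_site err_sites[] = {
	{ -EBUSY, "perf/x86: BTS conflicts with active events" },
	{ -EINTR, "example: wait was interrupted" },	/* invented entry */
};
#define NR_ERR_SITES (sizeof(err_sites) / sizeof(err_sites[0]))

/* True if 'code' lies in the reserved range and names a known site. */
static int is_ext_err(int code)
{
	if (code > -EXT_ERRNO_BASE || code < -MAX_ERRNO)
		return 0;
	return (size_t)(-code - EXT_ERRNO_BASE) < NR_ERR_SITES;
}

/* Site index -> extended code, i.e. what err_str() would evaluate to. */
static int ext_err_code(size_t site)
{
	assert(site < NR_ERR_SITES);
	return -(int)(EXT_ERRNO_BASE + site);
}

/* O(1) lookup back to the traditional errno; other values pass through. */
static int ext_errno(int code)
{
	if (!is_ext_err(code))
		return code;
	return err_sites[-code - EXT_ERRNO_BASE].errno_val;
}

static int last_err_code;	/* stand-in for current->err_code */

/* Topmost return of a system call: remember the extended code, hand the
 * traditional errno to userspace. Clearing the slot for non-extended
 * values is a choice made for this sketch only. */
static int syscall_return(int code)
{
	last_err_code = is_ext_err(code) ? code : 0;
	return ext_errno(code);
}
```

Since the extended code travels inside the return value until the very last step, an in-kernel caller that needs to tell errors apart only changes its comparison to `ext_errno(err) == -EINTR`.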

Regards,
--
Alex