librumpuser

Robert Elz Tue, 24 Mar 2020 23:15:32 -0700

    Date:        Tue, 24 Mar 2020 19:37:02 +0100
    From:        Kamil Rytarowski <[email protected]>
    Message-ID:  <[email protected]>


  | ASan detects not a hypothetical, but factual momory corruption.

Yes, I know that - and I believed you from the start that there was a
buffer overrun there.

The issue is whether the offending line was useful for anything or not.

I am currently running ATF tests with that line simply deleted (actually,
replaced by a check that there is a \0 in the buffer somewhere already,
and abort() if not).

I'm expecting, for our ATF tests, that this is going to work fine (not abort),
which is why your "fix" looked correct - it certainly avoided the buffer
overrun, and was simply writing a \0 either on top of one that already was
present in the same byte (I am going to do another check to see if this was
the case next, but your asan output from this message suggests probably not)
or after an earlier \0 somewhere previously in the buffer (ie: in empty
unused space beyond the end of the string (which the asan output suggests
is the more likely case)).

Note that knowing that this is true of our tests, while useful info, proves
nothing - different data might stress the code in different ways, hence I
am going to find where (everywhere) the data is first constructed, and check
that it always guarantees \0 termination.   If there is, then the total
fix for the ASAN problem is to delete the offending line.   If not, the
correct fix is to make sure that the data is always correctly \0 terminated
-- and then delete the offending line.    That's what I mean about "wait for
the real problem" - we need to find out whether the line that wrote beyond
the buffer end was there merely as a "I am going to treat this data like a
string, and bad things will happen if it isn't \0 terminated somehow, so
I will just stick a \0 after the buffer, just in case" type protection
mechanism, that was never really needed in the first place (a security
blanket), or whether there is some case where the code, given appropriate
data (say an exec that is MAXPATHLEN long, or something ... this is just
wild speculation of course) where it might currently be possible that no \0
is present, in which case that \0 added was saving things. and your fix was 
corrupting the data, and the correct fix is to make the buffer 1 byte bigger
so the \0 will fit (where the data is originally constructed).

We just don't know yet - and no amount of random testing will tell us (except
by fluke, and even that would just indicate that there is a real problem,
by managing to use data that triggers it, but probably not where in the code
is the base cause), it needs careful code reading.

This is why when the sanitisers find a problem, and it isn't obvious what
is the real cause (as opposed to the actual line that causes the problem)
you should avoid making random "fixes" to make the sanitiser report go away
(by no longer doing whatever it is complaining about - but not necessarily
implementing the original algorithm, whatever it was, correctly either).

This is an example of a case like that.   On the other hand, the other change
you committed at about the same time (the one with the iov where there was
an a && b test, where logically both have to be true for the code to be
executed, so it shouldn't matter which order, so programmers tend to put the
more likely to be false first, to save testing the less likely one when the
other is false - but in the case in questionb, when "b" was false, testing
"a" was accessing beyond the end of the memory.   Ie: the test should have
been b && a - and that's what you changted it to.   For that one, the cause
of the issue was obvious, and the fix correct - when you see ones that are
that simple, by all means, just fix them, as you did that time.

But if in doubt, file a PR - sanitiser detected bugs, while (at least often)
real bugs (ubsan not quite so much...) are rarely, if ever, critical fix
type - they have usually been present for years, harming no-one, so taking
a few extra days/weeks/months to arrive at the correct fix isn't really
doing much harm, usually.

kre

Re: CVS commit: src/lib/librumpuser

Reply via email to