Hi Hans,

On Fri, Jun 07, 2019 at 04:11:20PM +0200, Hans Verkuil wrote:
> On 6/7/19 3:55 PM, Marek Szyprowski wrote:
> > Hi Hans,
> > 
> > On 2019-06-07 15:40, Hans Verkuil wrote:
> >> On 6/7/19 2:47 PM, Hans Verkuil wrote:
> >>> On 6/7/19 2:23 PM, Hans Verkuil wrote:
> >>>> On 6/7/19 2:14 PM, Marek Szyprowski wrote:
> >>>>> On 2019-06-07 14:01, Hans Verkuil wrote:
> >>>>>> On 6/7/19 1:16 PM, Laurent Pinchart wrote:
> >>>>>>> Thank you for the patch.
> >>>>>>>
> >>>>>>> On Fri, Jun 07, 2019 at 10:45:31AM +0200, Hans Verkuil wrote:
> >>>>>>>> The __prepare_userptr() function made the incorrect assumption that if the
> >>>>>>>> same user pointer was used as the last one for which memory was acquired, then
> >>>>>>>> there was no need to re-acquire the memory. This assumption was never properly
> >>>>>>>> tested, and after doing that it became clear that this was in fact wrong.
> >>>>>>> Could you explain in the commit message why the assumption is not
> >>>>>>> correct ?
> >>>>>> You can free the memory, then allocate it again and you can get the same pointer,
> >>>>>> even though it is not necessarily using the same physical pages for the memory
> >>>>>> that the kernel is still using for it.
> >>>>>>
> >>>>>> Worse, you can free the memory, then allocate only half the memory you need and
> >>>>>> get back the same pointer. vb2 wouldn't notice this. And it seems to work (since
> >>>>>> the original mapping still remains), but this can corrupt userspace memory
> >>>>>> causing the application to crash. It's not quite clear to me how the memory can
> >>>>>> get corrupted. I don't know enough of those low-level mm internals to understand
> >>>>>> the sequence of events.
> >>>>>>
> >>>>>> I have test code for v4l2-compliance available if someone wants to test this.
> >>>>> I'm interested, I would really like to know what happens in the mm
> >>>>> subsystem in such case.
> >>>> Here it is:
> >>>>
> >>>> diff --git a/utils/v4l2-compliance/v4l2-test-buffers.cpp b/utils/v4l2-compliance/v4l2-test-buffers.cpp
> >>>> index be606e48..9abf41da 100644
> >>>> --- a/utils/v4l2-compliance/v4l2-test-buffers.cpp
> >>>> +++ b/utils/v4l2-compliance/v4l2-test-buffers.cpp
> >>>> @@ -797,7 +797,7 @@ int testReadWrite(struct node *node)
> >>>>          return 0;
> >>>>   }
> >>>>
> >>>> -static int captureBufs(struct node *node, const cv4l_queue &q,
> >>>> +static int captureBufs(struct node *node, cv4l_queue &q,
> >>>>                  const cv4l_queue &m2m_q, unsigned frame_count, int pollmode,
> >>>>                  unsigned &capture_count)
> >>>>   {
> >>>> @@ -962,6 +962,21 @@ static int captureBufs(struct node *node, const cv4l_queue &q,
> >>>>                                  buf.s_flags(V4L2_BUF_FLAG_REQUEST_FD);
> >>>>                                  buf.s_request_fd(buf_req_fds[req_idx]);
> >>>>                          }
> >>>> +                        if (v4l_type_is_capture(buf.g_type()) && q.g_memory() == V4L2_MEMORY_USERPTR) {
> >>>> +                                printf("\nidx: %d", buf.g_index());
> >>>> +                                for (unsigned p = 0; p < q.g_num_planes(); p++) {
> >>>> +                                        printf(" old buf[%d]: %p ", p, buf.g_userptr(p));
> >>>> +                                        fflush(stdout);
> >>>> +                                        free(buf.g_userptr(p));
> >>>> +                                        void *m = calloc(1, q.g_length(p)/2);
> >>>> +
> >>>> +                                        fail_on_test(m == NULL);
> >>>> +                                        q.s_userptr(buf.g_index(), p, m);
> >>>> +                                        printf("new buf[%d]: %p", p, m);
> >>>> +                                        buf.s_userptr(m, p);
> >>>> +                                }
> >>>> +                                printf("\n");
> >>>> +                        }
> >>>>                          fail_on_test(buf.qbuf(node, q));
> >>>>                          fail_on_test(buf.g_flags() & V4L2_BUF_FLAG_DONE);
> >>>>                          if (buf.g_flags() & V4L2_BUF_FLAG_REQUEST_FD) {
> >>>>
> >>>>
> >>>>
> >>>> Load the vivid driver and just run 'v4l2-compliance -s10' and you'll see:
> >>>>
> >>>> ...
> >>>> Streaming ioctls:
> >>>>          test read/write: OK
> >>>>          test blocking wait: OK
> >>>>          test MMAP (no poll): OK
> >>>>          test MMAP (select): OK
> >>>>          test MMAP (epoll): OK
> >>>>          Video Capture: Frame #000
> >>>> idx: 0 old buf[0]: 0x7f71c6e7c010 new buf[0]: 0x7f71c6eb4010
> >>>>          Video Capture: Frame #001
> >>>> idx: 1 old buf[0]: 0x7f71c6e0b010 new buf[0]: 0x7f71c6e7b010
> >>>>          Video Capture: Frame #002
> >>>> idx: 0 old buf[0]: 0x7f71c6eb4010 free(): invalid pointer
> >>>> Aborted
> >>> To clarify: two full size buffers are allocated and queued (that happens
> >>> in setupUserPtr()), then streaming starts and captureBufs is called which
> >>> basically just calls dqbuf and qbuf.
> >>>
> >>> Tomasz pointed out that all the pointers in this log are actually different.
> >>> That's correct, but here is a log where the old and new buf ptr are the same:
> >>>
> >>> Streaming ioctls:
> >>>          test read/write: OK
> >>>          test blocking wait: OK
> >>>          test MMAP (no poll): OK
> >>>          test MMAP (select): OK
> >>>          test MMAP (epoll): OK
> >>>          Video Capture: Frame #000
> >>> idx: 0 old buf[0]: 0x7f1094e16010 new buf[0]: 0x7f1094e4e010
> >>>          Video Capture: Frame #001
> >>> idx: 1 old buf[0]: 0x7f1094da5010 new buf[0]: 0x7f1094e15010
> >>>          Video Capture: Frame #002
> >>> idx: 0 old buf[0]: 0x7f1094e4e010 new buf[0]: 0x7f1094e4e010
> >>>          Video Capture: Frame #003
> >>> idx: 1 old buf[0]: 0x7f1094e15010 free(): invalid pointer
> >>> Aborted
> >>>
> >>> It's weird that the first log fails that way: if the pointers are different,
> >>> then vb2 will call get_userptr and it should discover that the buffer isn't
> >>> large enough, causing qbuf to fail. That doesn't seem to happen.
> >> I think that the reason for this corruption is that the memory pool used
> >> by glibc is now large enough for vb2 to think it can map the full length
> >> of the user pointer into memory, even though only the first half is actually
> >> from the buffer that's allocated. When you capture a frame you just overwrite
> >> a random part of the application's memory pool, causing this invalid pointer.
> >>
> >> But that's a matter of garbage in, garbage out. So that's not the issue here.
> >>
> >> The real question is what happens when you free the old buffer, allocate a
> >> new buffer, end up with the same userptr, but it's using one or more different
> >> pages for its memory compared to the mapping that the kernel uses.
> >>
> >> I managed to reproduce this with v4l2-ctl:
> >>
> >> diff --git a/utils/v4l2-ctl/v4l2-ctl-streaming.cpp b/utils/v4l2-ctl/v4l2-ctl-streaming.cpp
> >> index 28b2b3b9..8f2ed9b5 100644
> >> --- a/utils/v4l2-ctl/v4l2-ctl-streaming.cpp
> >> +++ b/utils/v4l2-ctl/v4l2-ctl-streaming.cpp
> >> @@ -1422,6 +1422,24 @@ static int do_handle_cap(cv4l_fd &fd, cv4l_queue &q, FILE *fout, int *index,
> >>             * has the size that fits the old resolution and might not
> >>             * fit to the new one.
> >>             */
> >> +          if (q.g_memory() == V4L2_MEMORY_USERPTR) {
> >> +                  printf("\nidx: %d", buf.g_index());
> >> +                  for (unsigned p = 0; p < q.g_num_planes(); p++) {
> >> +                          unsigned *pb = (unsigned *)buf.g_userptr(p);
> >> +                          printf(" old buf[%d]: %p first pixel: 0x%x", p, buf.g_userptr(p), *pb);
> >> +                          fflush(stdout);
> >> +                          free(buf.g_userptr(p));
> >> +                          void *m = calloc(1, q.g_length(p));
> >> +
> >> +                          if (m == NULL)
> >> +                                  return QUEUE_ERROR;
> >> +                          q.s_userptr(buf.g_index(), p, m);
> >> +                          if (m == buf.g_userptr(p))
> >> +                                  printf(" identical new buf");
> >> +                          buf.s_userptr(m, p);
> >> +                  }
> >> +                  printf("\n");
> >> +          }
> >>            if (fd.qbuf(buf) && errno != EINVAL) {
> >>                    fprintf(stderr, "%s: qbuf error\n", __func__);
> >>                    return QUEUE_ERROR;
> >>
> >>
> >> Load vivid, set up a pure white test pattern:
> >>
> >> v4l2-ctl -c test_pattern=6
> >>
> >> Now run v4l2-ctl --stream-user and you'll see:
> >>
> >> idx: 0 old buf[0]: 0x7f91551cb010 first pixel: 0x80ea80ea identical new buf
> >> <
> >> idx: 1 old buf[0]: 0x7f915515a010 first pixel: 0x80ea80ea identical new buf
> >> <
> >> idx: 2 old buf[0]: 0x7f91550e9010 first pixel: 0x80ea80ea identical new buf
> >> <
> >> idx: 3 old buf[0]: 0x7f9155078010 first pixel: 0x80ea80ea identical new buf
> >> <
> >> idx: 0 old buf[0]: 0x7f91551cb010 first pixel: 0x0 identical new buf
> >> <
> >> idx: 1 old buf[0]: 0x7f915515a010 first pixel: 0x0 identical new buf
> >> < 5.00 fps
> >>
> >> idx: 2 old buf[0]: 0x7f91550e9010 first pixel: 0x0 identical new buf
> >> <
> >> idx: 3 old buf[0]: 0x7f9155078010 first pixel: 0x0 identical new buf
> >>
> >> The first four dequeued buffers are filled with data, after that the
> >> returned buffer is empty because vivid is actually writing to different
> >> memory pages.
> >>
> >> With this patch the first pixel is always non-zero.
> > 
> > Good catch. The question is whether we treat that as undefined behavior 
> > and keep the optimization for 'good applications' or assume that every 
> > broken userspace code has to be properly handled. The good thing is that 
> > there is still imho no security issue. The physical pages gathered by 
> 
> Yeah, that scared me for a bit, but it all looks secure.
> 
> > vb2 in the worst case belong to no one else (vb2 is their last user; they
> > are not yet returned to the free pages pool).
> 
> I see three options:
> 
> 1) just always reacquire the buffer, and if anyone complains about it
>    being slower we point them towards DMABUF.

That doesn't really help right now, as DMABUF has the same property: the
pages are mapped and unmapped every time the buffer is touched by a device.

That could be addressed, though; it's not an inherent property of the
DMABUF buffer type.

> 
> 2) keep the current behavior, but document it.

I'd favour this. Ideally there should be a way to discard a buffer, and
DESTROY_BUF has been proposed before.

> 
> 3) as 2), but also add a new buffer flag that forces a reacquire of the
>    buffer. This could be valid for DMABUF as well. E.g.:
> 
>    V4L2_BUF_FLAG_REACQUIRE
> 
> I'm leaning towards the third option since it won't slow down existing
> implementations, yet if you do change the userptr every time, then you
> can now force this to work safely.

This would get around the problem of reallocating, but in a
use-case-specific way. DESTROY_BUF would be more generic. But I admit it
would require some rework of the vb2 codebase, and possibly drivers as
well. So I think this is a reasonable compromise.
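
To make the difference between options 2 and 3 concrete, here is a
minimal sketch in C of the decision involved. It is a simplified model
and not the vb2 code: struct plane_cache, plane_needs_reacquire() and
SKETCH_BUF_FLAG_REACQUIRE are hypothetical names, and the flag value is
made up, since V4L2_BUF_FLAG_REACQUIRE is only a proposal at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value, for illustration only; not part of the UAPI. */
#define SKETCH_BUF_FLAG_REACQUIRE (1u << 0)

/* Simplified model of what is remembered about a plane's last mapping. */
struct plane_cache {
	uintptr_t userptr;	/* user pointer the pages were acquired for */
	size_t length;		/* length of that mapping in bytes */
	bool mapped;		/* true if pages are currently held */
};

/*
 * Decide whether the pages behind a plane must be acquired again.
 *
 * Without the flag this models the existing optimisation: an unchanged
 * pointer and length reuse the old mapping. Setting the flag lets the
 * application say "I reallocated, do not trust the cached mapping",
 * which covers the free()/calloc() case that returns the same address.
 */
static bool plane_needs_reacquire(const struct plane_cache *cache,
				  uintptr_t userptr, size_t length,
				  uint32_t flags)
{
	assert(cache != NULL);

	if (!cache->mapped)
		return true;	/* nothing cached yet */
	if (flags & SKETCH_BUF_FLAG_REACQUIRE)
		return true;	/* explicitly requested by the application */

	/* Reuse the mapping only if pointer and length are both unchanged. */
	return cache->userptr != userptr || cache->length != length;
}
```

Existing applications that never set the flag keep the fast path, while
an application that reallocates its buffers can opt in per QBUF.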

-- 
Kind regards,

Sakari Ailus
[email protected]
