Hi Helen,

On Mon, Aug 12, 2019 at 08:41:33PM -0300, Helen Koike wrote:
> On 8/12/19 7:14 PM, Shuah Khan wrote:
> > On 8/12/19 1:10 PM, Shuah Khan wrote:
> >> On 8/12/19 12:52 PM, André Almeida wrote:
> >>> On 8/12/19 11:08 AM, Shuah Khan wrote:
> >>>> On 8/9/19 9:51 PM, Helen Koike wrote:
> >>>>> On 8/9/19 9:24 PM, André Almeida wrote:
> >>>>>> On 8/9/19 9:17 PM, Shuah Khan wrote:
> >>>>>>> On 8/9/19 5:52 PM, André Almeida wrote:
> >>>>>>>> On 8/9/19 6:45 PM, Shuah Khan wrote:
> >>>>>>>>> vimc uses Component API to split the driver into functional
> >>>>>>>>> components.
> >>>>>>>>> The real hardware resembles a monolith structure than component and
> >>>>>>>>> component structure added a level of complexity making it hard to
> >>>>>>>>> maintain without adding any real benefit.
> >>>>>>>>>        The sensor is one vimc component that would makes sense to 
> >>>>>>>>> be a
> >>>>>>>>> separate
> >>>>>>>>> module to closely align with the real hardware. It would be easier 
> >>>>>>>>> to
> >>>>>>>>> collapse vimc into single monolithic driver first and then split the
> >>>>>>>>> sensor off as a separate module.
> >>>>>>>>>
> >>>>>>>>> This patch series emoves the component API and makes minimal
> >>>>>>>>> changes to
> >>>>>>>>> the code base preserving the functional division of the code
> >>>>>>>>> structure.
> >>>>>>>>> Preserving the functional structure allows us to split the sensor 
> >>>>>>>>> off
> >>>>>>>>> as a separate module in the future.
> >>>>>>>>>
> >>>>>>>>> Major design elements in this change are:
> >>>>>>>>>        - Use existing struct vimc_ent_config and struct
> >>>>>>>>> vimc_pipeline_config
> >>>>>>>>>          to drive the initialization of the functional components.
> >>>>>>>>>        - Make vimc_ent_config global by moving it to vimc.h
> >>>>>>>>>        - Add two new hooks add and rm to initialize and register,
> >>>>>>>>> unregister
> >>>>>>>>>          and free subdevs.
> >>>>>>>>>        - All component API is now gone and bind and unbind hooks are
> >>>>>>>>> modified
> >>>>>>>>>          to do "add" and "rm" with minimal changes to just add and 
> >>>>>>>>> rm
> >>>>>>>>> subdevs.
> >>>>>>>>>        - vimc-core's bind and unbind are now register and 
> >>>>>>>>> unregister.
> >>>>>>>>>        - vimc-core invokes "add" hooks from its
> >>>>>>>>> vimc_register_devices().
> >>>>>>>>>          The "add" hooks remain the same and register subdevs. They
> >>>>>>>>> don't
> >>>>>>>>>          create platform devices of their own and use vimc's
> >>>>>>>>> pdev.dev as
> >>>>>>>>>          their reference device. The "add" hooks save their
> >>>>>>>>> vimc_ent_device(s)
> >>>>>>>>>          in the corresponding vimc_ent_config.
> >>>>>>>>>        - vimc-core invokes "rm" hooks from its unregister to
> >>>>>>>>> unregister
> >>>>>>>>> subdevs
> >>>>>>>>>          and cleanup.
> >>>>>>>>>        - vimc-core invokes "add" and "rm" hooks with pointer to 
> >>>>>>>>> struct
> >>>>>>>>> vimc_device
> >>>>>>>>>          and the corresponding struct vimc_ent_config pointer.
> >>>>>>>>>        The following configure and stream test works on all devices.
> >>>>>>>>>             media-ctl -d platform:vimc -V '"Sensor
> >>>>>>>>> A":0[fmt:SBGGR8_1X8/640x480]'
> >>>>>>>>>        media-ctl -d platform:vimc -V '"Debayer
> >>>>>>>>> A":0[fmt:SBGGR8_1X8/640x480]'
> >>>>>>>>>        media-ctl -d platform:vimc -V '"Sensor
> >>>>>>>>> B":0[fmt:SBGGR8_1X8/640x480]'
> >>>>>>>>>        media-ctl -d platform:vimc -V '"Debayer
> >>>>>>>>> B":0[fmt:SBGGR8_1X8/640x480]'
> >>>>>>>>>             v4l2-ctl -z platform:vimc -d "RGB/YUV Capture" -v
> >>>>>>>>> width=1920,height=1440
> >>>>>>>>>        v4l2-ctl -z platform:vimc -d "Raw Capture 0" -v
> >>>>>>>>> pixelformat=BA81
> >>>>>>>>>        v4l2-ctl -z platform:vimc -d "Raw Capture 1" -v
> >>>>>>>>> pixelformat=BA81
> >>>>>>>>>             v4l2-ctl --stream-mmap --stream-count=100 -d /dev/video1
> >>>>>>>>>        v4l2-ctl --stream-mmap --stream-count=100 -d /dev/video2
> >>>>>>>>>        v4l2-ctl --stream-mmap --stream-count=100 -d /dev/video3
> >>>>>>>>>
> >>>>>>>>> The third patch in the series fixes a general protection fault found
> >>>>>>>>> when rmmod is done while stream is active.
> >>>>>>>>
> >>>>>>>> I applied your patch on top of media_tree/master and I did some
> >>>>>>>> testing.
> >>>>>>>> Not sure if I did something wrong, but just adding and removing the
> >>>>>>>> module generated a kernel panic:
> >>>>>>>
> >>>>>>> Thanks for testing.
> >>>>>>>
> >>>>>>> Odd. I tested modprobe and rmmod both.I was working on Linux 5.3-rc2.
> >>>>>>> I will apply these to media latest and work from there. I have to
> >>>>>>> rebase these on top of the reverts from Lucas and Helen
> >>>>>>
> >>>>>> Ok, please let me know if I succeeded to reproduce.
> >>>>>>
> >>>>>>>>
> >>>>>>>> ~# modprobe vimc
> >>>>>>>> ~# rmmod vimc
> >>>>>>>> [   16.452974] stack segment: 0000 [#1] SMP PTI
> >>>>>>>> [   16.453688] CPU: 0 PID: 2038 Comm: rmmod Not tainted 5.3.0-rc2+ 
> >>>>>>>> #36
> >>>>>>>> [   16.454678] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> >>>>>>>> BIOS 1.12.0-20181126_142135-anatol 04/01/2014
> >>>>>>>> [   16.456191] RIP: 0010:kfree+0x4d/0x240
> >>>>>>>>
> >>>>>>>> <registers values...>
> >>>>>>>>
> >>>>>>>> [   16.469188] Call Trace:
> >>>>>>>> [   16.469666]  vimc_remove+0x35/0x90 [vimc]
> >>>>>>>> [   16.470436]  platform_drv_remove+0x1f/0x40
> >>>>>>>> [   16.471233]  device_release_driver_internal+0xd3/0x1b0
> >>>>>>>> [   16.472184]  driver_detach+0x37/0x6b
> >>>>>>>> [   16.472882]  bus_remove_driver+0x50/0xc1
> >>>>>>>> [   16.473569]  vimc_exit+0xc/0xca0 [vimc]
> >>>>>>>> [   16.474231]  __x64_sys_delete_module+0x18d/0x240
> >>>>>>>> [   16.475036]  do_syscall_64+0x43/0x110
> >>>>>>>> [   16.475656]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >>>>>>>> [   16.476504] RIP: 0033:0x7fceb8dafa4b
> >>>>>>>>
> >>>>>>>> <registers values...>
> >>>>>>>>
> >>>>>>>> [   16.484853] Modules linked in: vimc(-) videobuf2_vmalloc
> >>>>>>>> videobuf2_memops v4l2_tpg videobuf2_v4l2 videobuf2_common
> >>>>>>>> [   16.486187] ---[ end trace 91e5e0894e254d49 ]---
> >>>>>>>> [   16.486758] RIP: 0010:kfree+0x4d/0x240
> >>>>>>>>
> >>>>>>>> <registers values...>
> >>>>>>>>
> >>>>>>>> fish: “rmmod vimc” terminated by signal SIGSEGV (Address boundary
> >>>>>>>> error)
> >>>>>>>>
> >>>>>>>> I just added the module after booting, no other action was made.
> >>>>>>>> Here is
> >>>>>>>> how my `git log --oneline` looks like:
> >>>>>>>>
> >>>>>>>> 897d708e922b media: vimc: Fix gpf in rmmod path when stream is active
> >>>>>>>> 2e4a5ad8ad6d media: vimc: Collapse component structure into a single
> >>>>>>>> monolithic driver
> >>>>>>>> 7c8da1687e92 media: vimc: move private defines to a common header
> >>>>>>>> 97299a303532 media: Remove dev_err() usage after platform_get_irq()
> >>>>>>>> 25a3d6bac6b9 media: adv7511/cobalt: rename driver name to 
> >>>>>>>> adv7511-v4l2
> >>>>>
> >>>>> I couldn't reproduce the error, my tree looks the same:
> >>>>>
> >>>>> [I] koike@floko ~/m/o/linux> git log --oneline
> >>>>> e3345155c8ed (HEAD) media: vimc: Fix gpf in rmmod path when stream is
> >>>>> active
> >>>>> 43e9e2fe761f media: vimc: Collapse component structure into a single
> >>>>> monolithic driver
> >>>>> 8a6d0b9adde0 media: vimc: move private defines to a common header
> >>>>> 97299a303532 (media/master) media: Remove dev_err() usage after
> >>>>> platform_get_irq()
> >>>>> 25a3d6bac6b9 media: adv7511/cobalt: rename driver name to adv7511-v4l2
> >>>>
> >>>> Thanks Helen for trying to reproduce and sharing the result.
> >>>
> >>> Me and Helen found out what is the problem. If you follow this call trace:
> >>>
> >>> vimc_ent_sd_unregister()
> >>> v4l2_device_unregister_subdev()
> >>> v4l2_subdev_release()
> >>>
> >>> You'll notice that this last function calls the `release` callback
> >>> implementation of the subdevice. For instance, the `release` of
> >>> vimc-sensor is this one:
> >>>
> >>> static void vimc_sen_release(struct v4l2_subdev *sd)
> >>> {
> >>>     struct vimc_sen_device *vsen =
> >>>                 container_of(sd, struct vimc_sen_device, sd);
> >>>
> >>>     v4l2_ctrl_handler_free(&vsen->hdl);
> >>>     tpg_free(&vsen->tpg);
> >>>     kfree(vsen);
> >>> }
> >>>
> >>> And then you can see that `vsen` has been freed. Back to
> >>> vimc_ent_sd_unregister(), after v4l2_device_unregister_subdev(), the
> >>> function will call vimc_pads_cleanup(). This is basically a
> >>> kfree(ved->pads), but `ved` has just been freed at
> >>> v4l2_subdev_release(), producing a memory fault.
> >>>
> >>> To fix that, we found two options:
> >>>
> >>> - place the kfree(ved->pads) inside the release callback of each
> >>> subdevice and removing vimc_pads_cleanup() from
> >>> vimc_ent_sd_unregister()
> >>> - use a auxiliary variable to hold the address of the pads, for instance:
> >>>
> >>> void vimc_ent_sd_unregister(...)
> >>> {
> >>>      struct media_pad *pads = ved->pads;
> >>>      ...
> >>>      vimc_pads_cleanup(pads);
> >>> }
> >>>
> >>>
> >>
> >> I fixed a problem in the thirds patch. vimc-capture uses the first
> >> approach - placing the kfree(ved->pads) inside the release callback.
> >>
> >> I am debugging another such problem in unbind path while streaming.
> >> I am working on v2 and I will look for the rmmod problem and fix it.
> >>
> >> thanks again for testing and finding the root cause.
> >> -- Shuah
> > 
> > Hi Andre,
> > 
> > Here is what's happening.
> > 
> > Before this change, you can't really do rmmod vimc, because vimc is in
> > use by other component drivers. With the collapse, now you can actually
> > do rmmod on vimc and this problem in vimc_ent_sd_unregister() that frees
> > pads first and the does v4l2_device_unregister_subdev().
> > 
> > I fixed this in the 3/3 patch. I can reproduce the problem with patches 1 
> > and 2, and patch 3 fixes it.
> > 
> > Did you test with the third patch in this series?
> 
> yes, we tested with 3/3, but the new problem now is when doing the following
> in this order:
> 
>     v4l2_device_unregister_subdev(sd);
>     vimc_pads_cleanup(ved->pads);
> 
> 
> v4l2_device_unregister_subdev() calls the release function of the subdevice 
> that
> frees the ved object, so ved->pads is not valid anymore. That is why André 
> suggested
> a temporary variable to hold ved->pads and to be able to free it later:
> 
>     struct media_pad *pads = ved->pads;
> 
>     v4l2_device_unregister_subdev(sd);
>     vimc_pads_cleanup(pads); // So we don't use the ved object here anymore.

Can't you simply call vimc_pads_cleanup() in the release function of the
subdevice before freeing the ved object ?

-- 
Regards,

Laurent Pinchart

Reply via email to