Hi Konstantin,

On 01/16/15 09:03, Konstantin Belousov wrote:
On Thu, Jan 15, 2015 at 01:14:39PM +0100, Hans Petter Selasky wrote:
On 01/15/15 12:51, Konstantin Belousov wrote:
On Thu, Jan 15, 2015 at 11:49:09AM +0100, Hans Petter Selasky wrote:

I see no leakage in that case either!
Because these cases come through destroy_dev().


Are there more cases which I don't see?
You are breaking existig devfs KPI by your hack.  You introduce yet another
reference on the device, which is not supposed to be there.

Hi Konstantin,

I need a non-sleeping way to say a character device is no longer
supposed to be used and be able to re-use the device name right away
creating a new device. I guess you got that.
Yes, I got it.  The devfs design is not suitable for this, and your
hack is the good witness of the fact.

My opinion is that you should have tried to handle the issue at the driver
level, instead of making this devfs issue.  I.e., if you already have
cdev node with the correct name, driver should have replaced the private
data to point to new device.

I think this way cannot be implemented in a clean way, because of locking order reversal. And if you try to avoid the LOR you end up with the race. Chess mate sort of ;-) Let me explain:

Thread 1:
usb_sx_lock();
  cdev = Look in freelist for existing device();
else
  cdev = make_dev();
usb_sx_unlock();

Thread 2:
usb_sx_lock();
put cdev on freelist
usb_sx_unlock();

Thread 3:
usb_sx_lock();
cdev = remove first entry in freelist
usb_sx_unlock();

/*
 * XXX because USB needs to call destroy_dev() unlocked we
 * are racing with Thread 1 again
 */
destroy_dev(cdev);


This would also close a window where /dev node is non-existent or operate
erronously.

I'm not saying I plan so, but I think "cdevs" at some point need to understand mutexes and locks. That means like with other API's in the kernel we can associate a lock with the "cdev", and this lock is then used to ensure an atomic shutdown of the system in a non-blocking fashion. In my past experience multithreaded APIs should be high level implemented like this:

NON-BLOCKING methods:
lock(); **
xxx_start();
xxx_stop();
unlock();

BLOCKING methods:
setup(); // init
unsetup(); // drain

Any callbacks should always be called locked **

In devfs there was no non-blocking stop before I added the delist_dev() function.


You do not understand my point.

I object against imposing one additional global reference on all cdevs
just to cope with the delist hack.  See the patch at the end of the message.

It's fine by me.

WRT destroy_dev_sched_cb() calling delist_dev(), even after calling
delist_dev(), the node still exists in the /dev. It is only removed
after populate loop is run sometime later. dev_sched() KPI is inheritly
racy, drivers must handle the races for other reasons.

The populate loop is all running under the dev_lock() from what I can see and make_dev() is also keeping the same lock when inserting and removing new cdevs. The populate loop should always give a consistent view of the character devices available, and I don't see how "cdev" structures without the CDP_ACTIVE flag can appear with recently created ones, even if the name is the same.



Entry can be random, since after the populate loop is ran, your code in
other thread could start and create duplicate entry. There is a window
in the lookup where both directory vnode lock and mount point sx locks
are dropped. So running the populate does not guarantee anything.

If there is such a race, it is already there! My patch changes nothing
in that area:

Thread1:
Calls destroy_dev() and clears CDP_ACTIVE, after dev_unlock() it goes
waiting for some refs for xxx milliseconds.
Thread2:
Tries to create create a new character device having the same name like
the one in thread1. Device name duplication check is missed because
CDP_ACTIVE is cleared. Still thread1 is waiting.
Thread3:
Tries to open character device created by thread2 while thread1 is still
waiting for some ref held by a userspace app to go away.

This can happen already before my patches! What do you think?
Possibly.



At what level do you mean duplicate names, I don't get this fully? At
the directory level (DE nodes)? Or inside the list of character devices
(LIST_XXX)?
It does not matter, dup at either one directory level, or dup of full
names in the global list are equivalent (bad) things.

Like I write above I don't see where the problem is. At the cdev level, we are protecting the cdev's LIST with dev_lock() and only one entry will exist having CDP_ACTIVE bit set per unique cdev name and path. Else we will hit a panic in make_dev() and friends.

In the directory entry level the populate loop will also ensure a consistent view, and hence the cdev's LIST is consistent, the view presented to userspace will also be consistent.

That system functions can still call into the dangling read/write/ioctl functions is another story, and that is why I tell, that in order to simplify this teardown, we possibly should associate a client selectable lock with each CDEV, for teardown purposes. Like done for several years in the callout and USB APIs and possibly many other places.



Let me summarize:
- the extra reference on arbitrary cdev should be eliminated. The
    delist_dev_locked() may add the ref and set some CDP_ flag to
    indicate to destroy_dev() that it should do dev_rel().

It is possible to do this. I thought about this before doing my patch,
but decided to try to avoid adding yet another cdev flag.

- the existence of the duplicated entries should be either eliminated
    (I am not sure it is possible with your code), or we must ensure
    that only one name with CDP_ACTIVE flag set and given name exists.
    Then, all lookup code must be audited to take CDP_ACTIVE into account
    when accessing names.  I see at least devfs_find() and devfs_mknod()
    which must be changed.  I did not performed full audit.

I will check this path out aswell.


It seems it is simpler for me to try to clean up after the commit.
The patch was only lightly tested.  I post the patch for discussion,
not for you to committing it.  I will expedite the change into HEAD
after the consensus on it is made and adequate testing is performed.

I don't see any problems about your patch, except it adds a bit more code to the kernel.

--HPS
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Reply via email to