Hello Dave,
Please find my response inline.
>I have not had time to look at your fixes just yet, and will be out for the
>rest of the day. But the way it is *supposed* to work is this:
>
> 	if (stdata_locked)
> 	{
> 	    lis_up(&lis_stdata_sem);	/* let go global semaphore */
> 	    stdata_locked = 0 ;
> 	}
>This is where the stdata semaphore is released. From here on the stream
>head is findable in the list of stream heads. The sd_dev field indicates
>the original dev being opened.
Is there not a window here between the time the stdata sem is released
and the sd_opening sem is acquired by the opening thread? This window
could allow another contending open thread to take this stdata and
modify its contents. If the driver open fails for the second thread,
we could go through the close path.
The close routine examines 'opencnt' to determine whether the stream has
to be closed. If the second thread takes the error path in lis_stropen,
it can reach "error_rtn:" and potentially dismantle the stream (since
'opencnt' would not yet have been incremented in the other open thread,
which had the stream only half-open). Even if it did not lead to
deallocation of the stream, allowing a second open thread to modify the
stdata element before the sd_dev field has been finalized would lead
to problems at the driver level. For all practical purposes, this
stream should be protected from contending opens at least until the
sd_dev field stabilizes in the stdata structure. This means holding off
other opens on this stream until sd_dev has been finalized on the final
call to the driver open. After that point, if any user attempts an open
on such an existing stream, the results would be as the driver sees fit,
since LiS would reuse this stdata. Would you agree with this?
Regarding details to reproduce the problem :-
1. Please patch head.c with the attached patch (just to reproduce the
problem).
2. Patch minimux.c (with the patch from the earlier posting) to build LiS.
3. 'make test' to build the test app.
4. gcc -I <LiS_include_dir>/sys set_sleep_time.c -o set_sleep_time -lpLiS
   to build the app to tweak the minimux flag.
5. ./set_sleep_time
6. <Window_1> - while true; do ./test ; done
<Window_2> - while true; do ./test ; done
<Window_3> - while true; do dmesg | grep "Found head"; done
Once you hit the problem, you will see this printf. Ideally, it
should not appear: we expect this tweaked minimux to always give us
a new stream. Any stdata reuse means lis_stropen found another thread
already using that specific sd_dev. But since minimux always modifies
sd_dev in open, I expect this never to happen. If it does happen, it is
because we allow a second open thread to creep in while the first open
has yet to finalize the sd_dev for the stream under consideration.
After this, apply the LiS fix patch and rerun the tests. The dmesg
message should not appear.
Hope this helps.
Patch for head.c to reproduce problem :-
----------------------- start ---------------------
--- head.c.orig	Sat Jun 28 06:24:57 2003
+++ head.c.reproduce_problem	Mon Jun 30 04:43:26 2003
@@ -322,6 +322,16 @@
 /* ------------------------------------------------------------------- */
 /* Glob. Vars */
 
+/* dbg - VS */
+
+#define DEBUG_OPEN_RACES
+
+#ifdef DEBUG_OPEN_RACES
+/* Lock to avoid open/close races with minors usage in minimux.
+ */
+lis_spin_lock_t  mux_minor_lock;
+#endif /* DEBUG_OPEN_RACES */
+
 lis_spin_lock_t  lis_stdata_lock ;	/* protects global tables */
 lis_spin_lock_t  lis_qhead_lock ;	/* protects qhead, qtail & scanq */
 lis_spin_lock_t  lis_incr_lock ;	/* protects certain variable incrs */
@@ -3730,8 +3740,10 @@
 	existing_head = 0 ;
 	head = lis_head_get(NULL) ;	/* allocates new structure */
     }
-    else
+    else {
+	printk ("Found head 0x%p\n", head);
 	lis_head_get(head) ;		/* incrs ref count */
+    }
 
     if (!head)			/* can't proceed w/o a strm head */
     {
@@ -7435,6 +7447,10 @@
     lis_sem_init( &lis_stdata_sem, 1 );
     lis_sem_init( &lis_close_sem, 1 );
     lis_spin_lock_init(&lis_stdata_lock, "LiS-Stdata") ;
+/* dbg - VS */
+#ifdef DEBUG_OPEN_RACES
+    lis_spin_lock_init(&mux_minor_lock, "Mux-Minor-Lock") ;
+#endif
     lis_spin_lock_init(&lis_qhead_lock, "Qhead-Lock") ;
     lis_spin_lock_init(&lis_incr_lock, "Incr-Lock") ;
     lis_spin_lock_init(&lis_msg_lock, "Msg-Lock") ;
--------------------- end -------------------------
Regards,
Srinivasan.
>
> 	if (!hd_locked && (err = lis_down(&head->sd_opening)) < 0)
> 	    goto error_rtn ;
>And then we get the head-specific semaphore that serializes opens. There
>could be a race here between simultaneous opens. But only one thread wins
>the race.
>
>
>reuse_head: /* changed maj/mnr from clone open */
>If some other thread closed the stream while we were pending on the
>semaphore then it will be marked as such with this flag. If we observe
>this then we just start over, likely allocating a new stream for this open.
> 	if (F_ISSET(head->sd_flag,STRCLOSE))
> 	{
> 	    SET_FILE_STR(f, NULL);	/* unhook from file */
> 	    CP(head,odev) ;
> 	    lis_up(&head->sd_opening) ;	/* unlock head struct */
> 	    hd_locked = 0 ;
> 	    lis_head_put(head) ;	/* give back use count */
> 	    goto retry_from_start ;
> 	}
>
> 	if (LIS_SD_OPENCNT(head) >= 1)
>This code detects that another stream has opened this dev already. It
>could have been long ago or it could have been another thread winning the
>"simultaneous open" race. In either case we just add to the open count and
>use the existing stream.
>
>The close routine sets the STRCLOSE flag under protection of the sd_opening
>semaphore.
>
>So what I don't see in all this is how your case arises. Obviously there
>is something that has been overlooked, but the code seems to me to be very
>close to correct in resolving these races.
>
>Is it possible for you to construct the execution order in more detail? Or
>can someone else spot the logical hole?
>
>Thanks,
>Dave
>
>
>At 06:49 AM 6/29/2003 Sunday, Sahasranamam Srinivasan V wrote:
>>Hello Dave,
>>
>>Continuing with tests with LiS-2.16.8, I still faced problems
>>with open races. This is because there is a small window where we
>>release the stdata sem and then acquire the sd_opening sem (the
>>window is between line# 3807 and line# 3815).
>>In this window, the head remains susceptible to changes from other
>>contending open threads, which can see this head on the stdata
>>list (accessible anyway since the stdata sem has been released).
>>So, any failure in the second thread can send this stream head
>>down the close path (and potentially get it deallocated?), leading
>>to inconsistencies for the original thread that started the open.
>>In order to avoid this, I have tried the following:
>>* For all fresh opens where a stream head is newly allocated, flag it
>>  as "open in progress" by setting the STWOPEN bit in the 'sd_flag' field.
>>  This was an unused bit defined in head.h; I just used it here. Please
>>  note that this flag is set only for fresh stdata, at the very
>>  beginning in lis_alloc_stdata, before it goes onto the stdata list.
>>  This way, we can be sure nobody gets to see this head before it has
>>  been flagged.
>>* In lis_lookup_stdata(), skip streams that are thus flagged (just
>>  as is done for closing streams when STRCLOSE is set).
>>* Clear the STWOPEN bit appropriately (on both error and success paths).
>>  This is done only after all opens have been accounted for, including
>>  clone opens, i.e., once the 'sd_dev' field has stabilized with the
>>  final value for the device id. This way, we also avoid returning
>>  EINVAL for opens that would otherwise land on half-open streams (those
>>  for which the final value of sd_dev has not yet been decided).
>>  This situation is probably what John Boyd had tried to elaborate on in
>>  an earlier discussion thread on this mailing list. Hopefully, these
>>  changes fix up all those windows.
>>
>>I have tested this code with a tweaked version of minimux.c and a small test
>>program that opens and closes the minimux device. Tests with my driver were
>>also OK with this small modification. Please go through the attached patch
>>and test code and let me know if it is good enough.
>>
>>PS: For the test code, I had to introduce a lock in the minimux driver to fix
>>open/close races in minor allocation (mux_minor_lock). This has nothing to
>>do with the actual fix for open races in LiS; it is only for testing the
>>minimux driver (though it may not be a bad idea to retain it there). Also,
>>the test code for minimux is under "#ifdef DEBUG_OPEN_RACES". The LiS changes
>>do not have any such ifdefs. I have hardcoded the DEBUG_OPEN_RACES define
>>in both minimux.c and head.c since I did not want to meddle with the Makefile
>>for a test build. I also added a new ioctl in minimux.c to let it
>>simulate sleeping in opens based on a flag. Again, this is test code. I
>>found this minimux to be quite useful as a test aid, so I added this ioctl
>>instead of hard-coding the changes, so that it can be conditionally compiled
>>for test or production from a common src. You will find 2 test apps attached.
>>This is how you can reproduce the problem :-
>>
>>Build LiS without the STWOPEN changes. Use the modified minimux.c for this
>>build. But please note that you will need to retain the "mux_minor_lock"
>>declaration and the debug printf in lis_stropen() for this test. Only the
>>STWOPEN changes should be removed for reproducing the problem.
>>
>>(** All patches are against LiS-2.16.8 **)
>>
>>Compile the test app set_sleep_time.c and run it first. This enables the
>>flag in the minimux driver to sleep in open.
>>Compile and run test.c. Run two parallel instances of it in a tight loop
>>(while (1)). Run 'dmesg | grep "Found head"' in a tight loop in another
>>window. You will see this printf as soon as the problem is hit.
>>With the changes in LiS, this should not appear.
>>Hope this helps.
>>
>>Thanks and Regards,
>> Srinivasan.
_______________________________________________
Linux-streams mailing list
[EMAIL PROTECTED]
http://gsyc.escet.urjc.es/mailman/listinfo/linux-streams