I've gone back over some notes and checked the TimeSys kernel. We ran into the large semaphore around August '02. In hindsight, it is caused by a TimeSys extension that is enabled when CONFIG_PRIO_INHERIT is defined. In our case, however, basing the compile-time switch on PPC worked, so we didn't need a special patch.
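To make the compile-time switch concrete, here is a minimal sketch; it is not the actual LiS source. It keys the reserved size on CONFIG_PRIO_INHERIT instead of PPC and fails the build if the kernel structure would not fit. The names DEMO_SEM_WORDS, demo_kernel_sem, demo_sem_t and demo_sem_fits() are hypothetical; only the figures 50, 20 and 11 long words come from this thread, and the 11-word stand-in structure is an assumption in place of the real kernel semaphore.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reserved sizes, in long words. 50 and 20 are figures quoted
 * in this thread; the macro name is made up for this sketch. */
#if defined(CONFIG_PRIO_INHERIT)
#define DEMO_SEM_WORDS 50   /* priority inheritance enlarges the semaphore */
#else
#define DEMO_SEM_WORDS 20
#endif

/* Stand-in for the kernel's semaphore structure (assumption: 11 long words,
 * the figure given in this thread for a stock 2.6 kernel). The real
 * definition lives in the kernel's asm/semaphore.h. */
struct demo_kernel_sem {
    long words[11];
};

/* Opaque wrapper: drivers see only a fixed-size array, never kernel types. */
typedef struct {
    long sem_space[DEMO_SEM_WORDS];
} demo_sem_t;

/* Fail the build instead of overrunning the reserved space at run time. */
_Static_assert(sizeof(struct demo_kernel_sem)
                   <= sizeof(((demo_sem_t *)0)->sem_space),
               "kernel semaphore does not fit in the reserved space");

/* Run-time form of the same check, for a size only known at run time. */
static int demo_sem_fits(size_t kernel_sem_bytes)
{
    return kernel_sem_bytes <= sizeof(((demo_sem_t *)0)->sem_space);
}
```

A check like this only guards the embedded case; allocating the semaphore through lis_sem_alloc() avoids depending on the array size at all.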
It's all very historical, as by October '02 we had concluded that TimeSys pre-emptive scheduling in the kernel was incompatible with STREAMS and abandoned that kernel, having found a closer-to-stock kernel with all the BSP stuff we needed. In December '02/January '03 there was a detailed discussion here on SMP race conditions, and Dave reworked LiS to work in an SMP environment. Some of these fixes may have had the side effect of solving the problems in a TimeSys pre-emptive kernel, but we never went back to check.

So I don't know what the state of the current TimeSys kernel is vis-a-vis LiS, but back in the days of Linux 2.4.7 the large semaphore should have been based on CONFIG_PRIO_INHERIT rather than PPC.

Ragnar

----- Original Message -----
From: "Dave Grothe" <[EMAIL PROTECTED]>
To: "Ragnar Paulson" <[EMAIL PROTECTED]>
Cc: "LiS Mailing List" <[EMAIL PROTECTED]>
Sent: Wednesday, February 18, 2004 5:43 PM
Subject: [Linux-streams] Spin Lock and Semaphore Space

> Ragnar:
>
> What was the resolution of this? Do you patch LiS for TimeSys Linux? Is
> there something that I could put into the LiS Configure script that would
> sense this version and change the size of the kernel semaphore space array?
>
> What is the correct size for TimeSys?
>
> At the moment I have the space set to 20 long words. The 2.6 kernel needs
> 11. For a 64-bit system it would be more, maybe 15 or 16. So 20 will hold
> us for a stock 2.6 kernel and give people a chance to start using
> lis_sem_alloc() to get around the problem.
>
> Spin locks in 2.6 take 6 long words (8 in 64 bit). I have also upped the
> LiS spin lock space from 7 to 16 long words to allow for this. The same
> problem occurs here with spin locks embedded in STREAMS driver private
> structures. If you use lis_spin_lock_alloc() and a pointer to a spin lock,
> instead of an embedded one, then you will always be OK.
>
> -- Dave
>
> At 10:42 AM 2/18/2004, you wrote:
>
> >Matt and Dave,
> >
> >Much to my chagrin, I am responsible for the PPC semaphore definition. I
> >believe I have mentioned in this list before that that large semaphore
> >turned out to be a requirement in the TimeSys kernel we were using and not
> >related to the PPC at all. There are features in that version of Linux
> >(pre-emptive kernel scheduling, and real-time profiling) that bloated the
> >semaphore. Without those features the large semaphore is not required.
> >
> >As for how large it should be in general ... you need to know that a
> >kernel defined semaphore must be smaller than the LiS semaphore. As with
> >every data structure since the concept was conceived ... if you assign an
> >arbitrary limit, eventually someone will add enough fields to break it. :-)
> >
> >Ragnar
> >
> >
> >----- Original Message -----
> >From: "Dave Grothe" <[EMAIL PROTECTED]>
> >To: "Matthew Gierlach" <[EMAIL PROTECTED]>
> >Cc: "LiS Mailing List" <[EMAIL PROTECTED]>
> >Sent: Wednesday, February 18, 2004 10:29 AM
> >Subject: [Linux-streams] Re: SMP Panic we discussed
> >
> >
> > > Matt:
> > >
> > > I am posting an edited version of this exchange to the group since I think
> > > the information may be generally useful.
> > >
> > > It's in /usr/src/linux/include/asm/semaphore.h. Now that you mention it,
> > > and now that I look at the components of the kernel semaphore_t structure,
> > > 12 long words looks to be a little tight. You might try increasing that to
> > > 20 just to see what happens.
> > >
> > > I will do likewise in LiS-2.17.
> > >
> > > I don't want the size of the lis_semaphore_t or lis_spin_lock_t structures
> > > to depend upon the kernel version. Doing so would make STREAMS drivers
> > > dependent upon the kernel version and would force compilation from source
> > > on the target machine. So I just want to leave room in the LiS structures
> > > for a kernel structure to fit in.
> > >
> > > By the way, if you use the lis_sem_alloc() routine it will allocate enough
> > > space for the kernel semaphore no matter what array size is in the
> > > lis_semaphore_t structure. This is a way to guarantee that the structure
> > > is compatible without your driver having any knowledge of kernel semaphore
> > > structures.
> > >
> > > -- Dave
> > >
> > > At 07:23 PM 2/17/2004, Matthew Gierlach wrote:
> > >
> > > >Hi Dave:
> > > >
> > > > Would a change in the semaphore type in RH EL 3.0 produce this
> > > > behavior? I've looked at the definition of lis_semaphore_t in
> > > > LiS. It does not reference the Linux semaphore type, but it does
> > > > reserve 12 long words for non-PPC compilations and 50 long words
> > > > for PPC compilations. These long words are referenced as the
> > > > semaphore in the LiS code.
> > > >
> > > > Why the difference between PPC and non-PPC?
> > > >
> > > > What files would I look in in the EL kernel source to determine
> > > > if this is a compatibility issue?
> > > >
> > > > Thanks, Matt
> > > >
> > > >On Tue, 17 Feb 2004, Dave Grothe wrote:
> > > >
> > > > > No. I was seeing different symptoms. What I ran into was a chain of
> > > > > events that looked like runqueues calls service procedure calls kernel
> > > > > utility calls schedule(). But runqueues was holding a spin lock on the
> > > > > queue. So schedule() bumped the runqueues thread off the CPU. That caused
> > > > > a hang because of N other threads that wanted the same lock. So all CPUs
> > > > > were spinning on the lock and the only thread that would release the lock
> > > > > was scheduled off the CPUs. I have since changed the queue lock to a
> > > > > semaphore, plus a few other things that minimize this kind of contention in
> > > > > the first place.
> > > > >
> > > > > Your case, on the surface, looks like spin locks are not working on your
> > > > > system.
> > > > > The message from LiS is an assertion failure that should never
> > > > > print out in the absence of contention for a queue head which is otherwise
> > > > > protected by a spin lock. I have never seen the message that you are
> > > > > seeing.
> > > > >
> > > > > Is there something about your machine (caching? hardware locking? memory
> > > > > access sequencing) that would make the Linux implementation of spin locks
> > > > > fail? My gut feel is that you are looking for something very near the
> > > > > hardware here. Do you have another 2 CPU XEON machine to try it on? I am
> > > > > using an IBM x335. Take a careful walk through your machine's setup menus
> > > > > to see if there is some BIOS option that might affect multiple requestors
> > > > > to memory.
> > > > >
> > > > > Remember, Sun builds SPARCs and is used to thinking about memory in a
> > > > > different way than us Intel guys.
> > > > >
> > > > > -- Dave
_______________________________________________
Linux-streams mailing list
[EMAIL PROTECTED]
http://gsyc.escet.urjc.es/mailman/listinfo/linux-streams
