John,
Linux Fast-STREAMS is patched directly into the kernel.
For native FIFOs, init_special_inode is patched to use fifo_f_ops
instead of def_fifo_fops. For PIPEs, do_pipe() is patched to use stream
pipes instead of linux native pipes.
I added an i_stream to the inode structure (similar to i_cdev and
i_pipe) that points to an stdata (stream head). Inside the special
filesystem for streams, this is used to point to the stream head
associated with the inode.
For character-based FIFOs (S_IFCHR in the real filesystem), fifoopen
attaches the stream head directly to the filesystem inode on i_stream.
On open, the opener takes the i_sem semaphore on the inode so that
two openers of the same FIFO (whether S_IFIFO or S_IFCHR) are
serialized in case the big kernel lock is not held (which it is in
the case of character device opens, see chrdev_open(), but must be
released so that the open can sleep in the case of qwait(), qwait_sig()
and fifoopen()).
Then the simple SVR 4 vnode rule applies: if there is already a
v_stream associated with the vnode, it is used and qi_open is called with
the existing stream head/queue pair; if there is no v_stream associated
with the vnode, a stream head is created and qi_open is called with the
newly created stream head/queue pair. In SVR3-like Linux we have inodes,
not vnodes (a BSD concept), and I have added the i_stream pointer to
parallel the v_stream pointer.
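
To make the rule concrete, here is a minimal user-space sketch in C. It is
not the Linux Fast-STREAMS source: model_inode, model_qi_open and
model_fifo_open are hypothetical names, struct stdata is reduced to a
counter, and the i_sem locking is only noted in a comment.

```c
/* User-space model only: these structs are stand-ins, not the kernel's. */
#include <stdlib.h>

struct stdata {                 /* stand-in for a stream head */
    int open_count;             /* how many times qi_open has been called */
};

struct model_inode {            /* hypothetical stand-in for struct inode */
    struct stdata *i_stream;    /* NULL until the FIFO is first opened */
};

/* Stand-in for the driver's qi_open; it only counts opens. */
static int model_qi_open(struct stdata *sd)
{
    sd->open_count++;
    return 0;
}

/* The rule: reuse i_stream when present, otherwise create and attach. */
static struct stdata *model_fifo_open(struct model_inode *inode)
{
    /* In the kernel this section would run with the inode's i_sem held. */
    if (inode->i_stream == NULL) {
        struct stdata *sd = calloc(1, sizeof(*sd));
        if (sd == NULL)
            return NULL;
        inode->i_stream = sd;   /* first open: attach a new stream head */
    }
    if (model_qi_open(inode->i_stream) != 0)
        return NULL;
    return inode->i_stream;     /* later opens see the same stream head */
}
```

Calling model_fifo_open() twice on one model_inode should hand back the
same stream head both times, while a second model_inode gets its own,
which is the behavior described for S_IFIFO and S_IFCHR FIFOs.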
So, for S_IFIFO and S_IFCHR FIFOs, if the FIFO is already open, the
stream head is attached at i_stream. If the FIFO has not been opened
yet, i_stream is NULL. This means that S_IFCHR FIFOs act in the same
way as S_IFIFO FIFOs: if you have two distinct inodes in the filesystem,
regardless of the fact that they share the same major/minor device
number, they each represent distinct FIFOs.
LiS attaches stream heads to the u.generic_ip pointer, which
unfortunately is part of the file-specific union in the inode structure.
If LiS attempts to attach a stream head to a real filesystem inode the
way that Linux Fast-STREAMS is doing it, it will risk corrupting the
real filesystem's file system specific data in the inode.
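
A small user-space sketch of why the union is the problem. The layout
below is a simplified assumption modeled on the inode's file-specific
union; model_inode, model_ext2_info and the attach_* helpers are
hypothetical names for illustration only.

```c
/* User-space model only: a simplified stand-in for the inode layout. */
#include <stddef.h>

struct model_ext2_info {        /* hypothetical filesystem-private data */
    unsigned long i_data[4];
};

struct model_inode {
    union {                     /* file-specific union: members overlap */
        struct model_ext2_info ext2_i;
        void *generic_ip;
    } u;
    void *i_stream;             /* separate field, outside the union */
};

/* Attach through the union: overwrites the filesystem's private data. */
static void attach_via_union(struct model_inode *ino, void *sd)
{
    ino->u.generic_ip = sd;
}

/* Attach through the dedicated pointer: the union is left untouched. */
static void attach_via_field(struct model_inode *ino, void *sd)
{
    ino->i_stream = sd;
}

/* Returns nonzero if the first word of the private data is unchanged. */
static int fs_data_intact(const struct model_inode *ino, unsigned long expected)
{
    return ino->u.ext2_i.i_data[0] == expected;
}
```

On an inode that belongs to a real filesystem, attach_via_union() clobbers
the first word of the private data, while attach_via_field() does not.
That is the reason for adding a separate i_stream pointer.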
Linux Fast-STREAMS changes the kernel source. LiS cannot so easily.
It is back to licensing, LGPL, ...
--brian
On Wed, 08 Oct 2003, John A. Boyd Jr. wrote:
> Do you actually use S_IFIFO? If so, how do you get around the kernel
> hardwiring all S_IFIFO inodes to the kernel's fifo implementation?
>
> Maybe I can pick up a trick from you, if you have a method that
> works.
>
> -John
>
> Brian F. G. Bidulock wrote:
> > John,
> >
> > On Wed, 08 Oct 2003, John A. Boyd Jr. wrote:
> >
> >
> >>I'm not sure I follow all of what you are saying here. I certainly
> >>don't agree with all of it, but leaving disagreements aside, I want
> >>to ask you how you ensure with this "stacking" semantic that two
> >>openers who are opening explicitly for the purpose of communicating
> >>with each other via the opened STREAM, will indeed be communicating
> >>via a common stream, and not via separate "stacked" STREAMS.
> >>
> >>More specifically, it would appear to me that FIFOs would not work
> >>with your method, since in order to be useful in general, they require
> >>multiple concurrent opens of a single STREAM.
> >
> >
> > FIFOs are a different mechanism. Ala SVR 4, I attach the stream head
> > directly to the file system inode in that case (whether S_IFCHR or S_IFIFO).
> > But also note that FIFOs are not cloneable.
> >
> > --brian
> >
> >
> >>This is a bug that has crept into LiS on more than one occasion; namely,
> >>that someone would reopen a FIFO or pipe end expecting to be able to
> >>communicate with prior openers, but end up not communicating at all.
> >>The reason they weren't communicating is that they opened entirely
> >>different FIFOs or pipes, not the same one.
> >>
> >>Your impression of 'connld' also doesn't make sense to me. I wrote
> >>the LiS implementation, and what you describe as its semantics don't
> >>match up for me.
> >>
> >>-John
> >>
> >>Brian F. G. Bidulock wrote:
> >>
> >>>Dave,
> >>>
> >>>Here's some feedback. I don't have access to source code for any branded
> >>>Unix. (I'm still in the clean room, which is good.) From the Solaris and
> >>>Unixware documentation, however, it is clear that when the "clone" driver
> >>>is used, that a "new and unique minor number" must be assigned by the
> >>>driver. I have also seen statements (although I can't remember just where)
> >>>that the driver must not assign a new major device number (it appears that
> >>>only "clone" is allowed to do this). In Magic Garden it explains that the
> >>>SVR4 STREAMS executive performs a setq() after qi_open of the driver if the
> >>>major device number changes, but this is for "clone" I believe. LiS "clone"
> >>>performs the setq() itself within its qi_open routine.
> >>>
> >>>I believe the original purpose in LiS of redirecting to an open (or existing)
> >>>stream head via the device number was for the "connld" module, which opens
> >>>a pipe and then uses the pipe's character device number (major and minor) to
> >>>redirect the LiS executive to use the new pipe end. In SVR4 pipes use a
> >>>different file system from normal streams devices, even if they are
> >>>STREAMS-based bi-directional pipes.
> >>>
> >>>Unixware documentation, on the other hand talks of "clonable minors" and
> >>>"clone channels".
> >>>
> >>>In practice if the minor device number is really meant to be clonable, then
> >>>there is no reason to assign a new or unique device number. The following is
> >>>pretty much the situation:
> >>>
> >>>When opened via "clone" the driver must assign a new and unique device number.
> >>>If it does not, yet does not return an error number to qi_open() then the
> >>>executive should close the device by calling qi_close() and return ENXIO to
> >>>the open(2) call, particularly if filesystem corruption would result down
> >>>the road.
> >>>
> >>>When a minor device is marked as cloneable (SAP_CLONE directive to the uw7
> >>>SAD driver, or implementation specific d_flags on initialization), then the
> >>>driver should be permitted to reuse the same minor device number. This has
> >>>one ramification for clonable minors: only one stream head can be accessed
> >>>by an explicit open of the minor device. Clonable minors can get around this
> >>>by marking the minor clonable and never permitting an explicit open of the
> >>>minor device (a new stream head is always created when the minor is opened).
> >>>The special file system must of course permit this.
> >>>
> >>>For "clone," because minor devices can subsequently be opened explicitly
> >>>without a clone open (DRVOPEN), there should be one stream head per special
> >>>file. It is possible, however, to even accommodate this by stacking stream
> >>>heads behind the special file and only permitting explicit open of the
> >>>topmost stream-head in the stack.
> >>>
> >>>This latter is the approach that I took with Linux Fast-STREAMS. If a
> >>>"clone" open returns a device number that already exists, the new stream
> >>>head is used and the stream head is "stacked" on the special file. A clone
> >>>open (whether via "clone" or a clonable minor) will always result in a new
> >>>stream head "stacked" in this fashion. In the case of "clone" there should
> >>>be only one stream head in the stack if the documentation rules ("new and
> >>>unique minor device number assigned") are followed. When an explicit open of
> >>>the device number (DRVOPEN) is performed, the topmost stream head in the
> >>>"stack" has its qi_open() procedure called.
> >>>
> >>>This suits my purposes well and seems consistent with the uw7 documentation.
> >>>
> >>>I have the difficulty that I need to open thousands of streams for
> >>>configuration of SS7 protocol stacks. Under LiS, I used to dynamically assign
> >>>new major device numbers (sometimes on demand), and used the capability that
> >>>LiS allows you to reassign the major device number on open. (In fact LiS
> >>>allows you to reassign the major and minor device number on a DRVOPEN, which
> >>>made things easier). This is consistent with the "new and unique"
> >>>requirement. The strinet driver does this too. Although it has clone minors
> >>>(/dev/tcp, /dev/udp etc. are all minor device numbers under the socksys major)
> >>>a new unique major/minor device number combination is assigned for each
> >>>stream.
> >>>
> >>>One problem with this approach was that I had applications where I needed to
> >>>open 10's of thousands of streams, linked under multiplexing drivers and the
> >>>like. Almost all these streams are pseudo-devices and few need to be opened
> >>>explicitly via major/minor device number. The system could quite quickly run
> >>>out of device numbers altogether if the "new and unique" approach is used.
> >>>(Linux at one time used to associate a character device with each socket
> >>>opened. Failure to scale is why it was dropped.) The clonable minor approach
> >>>allows any number of stream heads to be stacked behind the same minor device
> >>>number, yet is compatible with the old "clone" method.
> >>>
> >>>It looks like "clone" goes all the way back to Ritchie's Research UNIX,
> >>>where the original "new and unique" requirement came from.
> >>>
> >>>It is in fact a lot cleaner that if a clone opened driver returns an existing
> >>>major/minor device number that the stream head be "stacked" on that special
> >>>file rather than anything else. This is because if the driver has returned
> >>>zero (0) it has attached (or could have attached) private structures to the
> >>>passed in queue pair. This stream head should not be freed without calling
> >>>the driver's qi_close() procedure so that it can remove its private
> >>>structures from the queue pair. So I suppose the choice is to "stack" the
> >>>stream or to call qi_close() and return ENXIO to open(2). I think the only
> >>>reason LiS is doing any different right now is because "connld" uses this
> >>>corner to redirect the open to the new existing pipe end. I rewrote "connld"
> >>>for Linux Fast-STREAMS so that this behavior was no longer necessary ("connld"
> >>>more directly installs the stream head for the new pipe end against the file
> >>>descriptor.) This was a lot easier (it seems) in SVR 4 because SVR 4 returns
> >>>the vnode as a return to c_open calling spec_open. Linux must actually
> >>>install dentry and vfsmount against the file pointer.
> >>>
> >>>One more note about stacking stream heads: the file->private_data pointer
> >>>always points to the stream head in the stack that belongs to an open file
> >>>descriptor. Thus a file pointer only references one stream head during its
> >>>lifetime. An inode, on the other hand can reference multiple stream heads.
> >>>
> >>>I wish I could point you to the code for Linux Fast-STREAMS so you could
> >>>follow what I have done as well, but I'm still packaging for release...
> >>>
> >>>--brian
> >>>
> >>>
> >>>On Mon, 06 Oct 2003, Dave Grothe wrote:
> >>>
> >>>
> >>>
> >>>>I have the old multi-threaded open/close race problem fixed, or at least I
> >>>>will claim that it is fixed until someone produces an example that
> >>>>demonstrates otherwise.
> >>>>
> >>>>Before releasing the LiS incorporating these changes I need to solicit your
> >>>>opinions on a matter.
> >>>>
> >>>>When a user process opens a specific stream {maj,min} for a second or
> >>>>subsequent open, LiS (and Solaris) calls the driver open routines for all
> >>>>the modules in the stream to announce this fact. These open calls are not
> >>>>balanced by calls to the driver close routine, since the close routine is
> >>>>called only on the LAST close on the stream.
> >>>>
> >>>>So far, so good.
> >>>>
> >>>>When a clone open occurs the initial open is to {clone,maj} with the
> >>>>driver's open routine being called with {maj,0} and the clone flag
> >>>>set. The driver returns an updated {maj,min} to use. If this {maj,min} is
> >>>>currently not open then STREAMS simply uses the stream head structure that
> >>>>was originally for {clone,maj} to operate {maj,min}.
> >>>>
> >>>>So far, so good.
> >>>>
> >>>>But what happens if a clone open returns a {maj,min} that is already
> >>>>open? At the moment, LiS does NOT call the open routine of the driver that
> >>>>owns {maj,min}, on the theory that the driver open routine was already
> >>>>called via the clone driver.
> >>>>
> >>>>But is that correct? An alternate theory is that a clone open to an
> >>>>already-open {maj,min} could be seen as an indirect open to that stream and
> >>>>that normal re-open semantics would be to call the open routines.
> >>>>
> >>>>Does anyone have any direct evidence of what other STREAMS implementations
> >>>>do in this case? I don't have a Solaris test case for this.
> >>>>
> >>>>Thanks,
> >>>>Dave
> >>>>
> >>>>
> >>>>_______________________________________________
> >>>>Linux-streams mailing list
> >>>>[EMAIL PROTECTED]
> >>>>http://gsyc.escet.urjc.es/mailman/listinfo/linux-streams
--
Brian F. G. Bidulock    | The reasonable man adapts himself to the |
[EMAIL PROTECTED]       | world; the unreasonable one persists in  |
http://www.openss7.org/ | trying to adapt the world to himself.    |
                        | Therefore all progress depends on the    |
                        | unreasonable man. -- George Bernard Shaw |