was written for. It shouldn't cause much of a problem for a fairly
standard Unix variant.
-John
John A. Boyd Jr. wrote:
I'm neither looking at code as I write this, nor at any documentation, but hopefully the following will add to the discussion, even though it's off the top of my head.
There are two obvious ways cloning can be done: one is via the CLONE major, the other is with the CLONABLE bit set. I don't remember the logic now, but I'd specifically determined (a while ago) that every STREAMS implementation will do one thing for sure when cloning is possible: pass an unused, available STREAM head to the driver open. This is to ensure that if the driver open decides to clone a new STREAM, the right choice will have been made. I take this as an implicit requirement, by the way, until someone can prove differently.
The alternative, as I believe Ragnar has speculated, is unworkable. A STREAM head once finally open must correspond to a single minor device, and LiS accommodates that case by finding the head corresponding to the driver open's returned minor device, if one is found, throwing away the new head, since it isn't needed. But the alternative case can't be so easily accommodated, namely, passing in an open head that is in use when cloning (i.e., changing the minor device) is possible.
But here's the interesting case: cloning does not have to be explicitly specified by one of the above obvious methods - it's always possible for a driver open to change minor device number, whether cloning was indicated or not, and the effect must be the same as cloning. Because of that, all STREAMS implementations that I have used assume this possibility, by always passing a new stream head to a driver open routine (at least, that's my recollection; someone might want to check this, but I do remember setting up tests for it on different systems, none of which I have access to anymore).
If a STREAMS subsystem doesn't always pass a new head to a driver open, it opens itself to the possibility of a driver requesting a new minor device for an already open stream, and changing the minor device of an open stream is not a good idea; it can compromise not just the STREAMS subsystem, but the OS as well.
I would suggest then, that the rule for STREAMS generally, not just LiS, is that a driver's open should be called if a new head is being passed, but only if a new head is being passed. If LiS doesn't now work this way, it should, in my opinion (I know I've changed the code to do this in the past, but I haven't checked this for a long time).
I should note that LiS necessarily treated FIFOs (and thus pipes) as character devices, because it was not possible (although it may now be possible) to replace kernel FIFOs and pipes more directly (although fattach allows similar capability in LiS). Other STREAMS implementations have either not had FIFOs and pipes (Magic Garden predates these capabilities, for example), or used ST_FIFO or the equivalent as a device type for them. That bears on this issue only in that FIFOs and pipes don't have a regular driver open to call in most implementations, at least not one that a user can see. LiS, however, does have one, and in many respects, it behaves normally. Its big difference is that FIFO queues replace the head queues.
Connld is a special case that doesn't otherwise relate to this issue. Connld must provide not just a new head, but an entirely new pipe, at the level of file pointers. It's not a driver in any event; it's a module, and it likely couldn't be implemented as a fully normal module.
I believe a documented rule is that STREAMS heads and inodes must correspond exactly. Files are a different matter, but most Unixen allow multiple (file-level) opens of a single inode. Linux differs at this level a bit, but Linux still has inodes, and still uses them pretty much conventionally, ignoring the details.
It's certainly not impossible for non-FIFO STREAMS to be reopened. It's also not against the rules for a driver to change not just the minor number, but the major as well. It may not be recommended, but it's certainly possible, and I believe intentionally so. Changing the major is even necessary when the CLONE major is used, just to state the obvious case. It is only an error when the driver changes the major to a non-STREAMS major, which can be checked for.
Hopefully, this will suggest some tests that one might do on other systems, e.g., changing the STREAMS major as well as the minor, and checking the queue addresses when trying to reopen an already open STREAMS minor (a different queue address passed in suggests a different head as well). One might also write a module test that checks which queue addresses are passed to its open (I'm sure I've done these tests before, but I don't know where the code would be, and it wouldn't belong to me anyway, since I wrote it under a work-for-hire agreement).
-John
David Grothe wrote:
Brian:
Thanks for the info.
At the moment I have it working such that LiS will call the driver's open routine if a clone open changes the {maj,min} to that of an existing open stream. This maps the semantics of this situation to that of the user performing a directed open to that same {maj,min}. This allows the driver to know that a re-open is being attempted, and to reject it if such things are not allowed. If the open succeeds then the stream head is shared between the two files with appropriate open-count increments in the stream head structure. Thus, the driver's close routine gets called once both files close the stream.
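The shared-head bookkeeping described above can be sketched roughly as follows. This is only an illustrative model, not LiS code: `shared_head`, `shared_open`, `shared_close` and the `allow_reopen` flag are hypothetical names, and the choice of EBUSY for a rejected re-open is an assumption.

```c
#include <errno.h>

/* Hypothetical model of one stream head shared by several files. */
struct shared_head {
    int open_count;       /* number of files currently sharing this head */
    int allow_reopen;     /* driver policy: nonzero if re-opens are allowed */
    int drv_open_calls;   /* how many times the driver open was called */
    int drv_close_calls;  /* how many times the driver close was called */
};

/* Stand-in for the driver's open routine: it sees every open attempt,
 * including re-opens, and may reject a re-open (EBUSY is an assumption). */
static int shared_drv_open(struct shared_head *sh)
{
    sh->drv_open_calls++;
    if (sh->open_count > 0 && !sh->allow_reopen)
        return EBUSY;
    return 0;
}

/* Stand-in for the driver's close routine. */
static void shared_drv_close(struct shared_head *sh)
{
    sh->drv_close_calls++;
}

/* An open (directed, or a clone open landing on an existing {maj,min}):
 * call the driver, and bump the open count only if it accepts. */
static int shared_open(struct shared_head *sh)
{
    int err = shared_drv_open(sh);
    if (err)
        return err;
    sh->open_count++;
    return 0;
}

/* A close: the driver close runs only when the last file goes away. */
static void shared_close(struct shared_head *sh)
{
    if (sh->open_count > 0 && --sh->open_count == 0)
        shared_drv_close(sh);
}
```

With two successful opens followed by two closes, the driver open runs twice but the driver close runs once, which is the unbalanced open/close pattern under discussion.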
One of my "round tuit" projects is to re-work LiS' notion of the dev_t structure such that it uses, say, 10 bits for major and 22 bits for minor, or 12/20. The idea being to allow a nearly unlimited number of minor devices under a single major. LiS can do this because it has file system status in the kernel and can manage its own major/minor numbers in its own inode structure.
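For illustration, a 12/20 split of a 32-bit device number could be packed and unpacked as below. The names (`sketch_dev_t`, `sketch_mkdev`, and so on) are hypothetical and are not LiS's actual definitions; the 32-bit width is an assumption.

```c
#include <stdint.h>

/* Hypothetical names; the real LiS definitions may differ. */
#define SKETCH_MINOR_BITS 20u   /* 12/20 split; use 22u for a 10/22 split */
#define SKETCH_MAJOR_BITS (32u - SKETCH_MINOR_BITS)
#define SKETCH_MINOR_MASK ((UINT32_C(1) << SKETCH_MINOR_BITS) - 1u)
#define SKETCH_MAJOR_MASK ((UINT32_C(1) << SKETCH_MAJOR_BITS) - 1u)

typedef uint32_t sketch_dev_t;

/* Pack a major and minor into one device number; out-of-range bits
 * are masked off rather than reported as an error. */
static inline sketch_dev_t sketch_mkdev(uint32_t maj, uint32_t min)
{
    return ((maj & SKETCH_MAJOR_MASK) << SKETCH_MINOR_BITS)
           | (min & SKETCH_MINOR_MASK);
}

/* Extract the major number (the high-order bits). */
static inline uint32_t sketch_major(sketch_dev_t dev)
{
    return dev >> SKETCH_MINOR_BITS;
}

/* Extract the minor number (the low-order bits). */
static inline uint32_t sketch_minor(sketch_dev_t dev)
{
    return dev & SKETCH_MINOR_MASK;
}
```

A 20-bit minor field gives 1,048,576 minors per major and a 22-bit field gives 4,194,304, against 256 for an 8-bit field.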
The /dev entries will still have to fit within Linux's more restricted scheme, but we really need the large number of minors for clone devices, not in order to have a zillion entries in the /dev directory.
I want to have a version of LiS in which this is the only change from the previous since it will necessitate driver recompilation to take advantage of the new feature. At my present rate of speed it will be at least Thanksgiving before this sees the light of day.
I haven't looked at the 2.6 kernel since it was about 2.5.60-something, but it seems that the kernel guys are sticking with the 8/8 form for kdev_t for another release. Is that still the case in 2.6?
-- Dave
At 03:25 PM 10/6/2003 Monday, Brian F. G. Bidulock wrote:
Dave,
Here's some feedback. I don't have access to source code for any branded
Unix. (I'm still in the clean room, which is good.) From the Solaris and
Unixware documentation, however, it is clear that when the "clone" driver
is used, that a "new and unique minor number" must be assigned by the
driver. I have also seen statements (although I can't remember just where)
that the driver must not assign a new major device number (it appears that
only "clone" is allowed to do this). In Magic Garden it explains that the
SVR4 STREAMS executive performs a setq() after qi_open of the driver if the
major device number changes, but this is for "clone" I believe. LiS "clone"
performs the setq() itself within its qi_open routine.
I believe the original purpose in LiS of redirecting to an open (or existing)
stream head via the device number was for the "connld" module, which opens
a pipe and then uses the pipe's character device number (major and minor) to
redirect the LiS executive to use the new pipe end. In SVR4 pipes use a
different file system from normal streams devices, even if they are
STREAMS-based bi-directional pipes.
Unixware documentation, on the other hand, talks of "clonable minors" and "clone channels".
In practice if the minor device number is really meant to be clonable, then
there is no reason to assign a new or unique device number. The following is
pretty much the situation:
When opened via "clone" the driver must assign a new and unique device number.
If it does not, yet does not return an error number from qi_open(), then the
executive should close the device by calling qi_close() and return ENXIO to
the open(2) call, particularly if filesystem corruption would result down
the road.
When a minor device is marked as cloneable (SAP_CLONE directive to the uw7
SAD driver, or implementation specific d_flags on initialization), then the
driver should be permitted to reuse the same minor device number. This has
one ramification for clonable minors: only one stream head can be accessed
by an explicit open of the minor device. Clonable minors can get around this
by marking the minor clonable and never permitting an explicit open of the
minor device (a new stream head is always created when the minor is opened).
The special file system must of course permit this.
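The two rules above might be checked by the executive along these lines. This is a sketch under stated assumptions, not code from any STREAMS implementation: `sketch_exec`, `sketch_finish_clone_open` and the fixed-size minor table are hypothetical, the `close_calls` counter stands in for an actual qi_close() call, and the out-of-range guard is the sketch's own addition.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_NMINORS 8   /* arbitrary table size for the sketch */

/* Hypothetical per-major bookkeeping kept by the executive. */
struct sketch_exec {
    bool in_use[SKETCH_NMINORS];    /* minor already has an open stream */
    bool clonable[SKETCH_NMINORS];  /* minor is marked clonable */
    int  close_calls;               /* stands in for calls to qi_close() */
};

/* Called after the driver's qi_open() has returned 0 on a clone open.
 * returned_minor is the minor the driver handed back.
 * Returns 0 on success or an errno value for open(2). */
static int sketch_finish_clone_open(struct sketch_exec *ex, int returned_minor)
{
    /* Bounds guard added by the sketch; not part of the rules above. */
    if (returned_minor < 0 || returned_minor >= SKETCH_NMINORS) {
        ex->close_calls++;
        return ENXIO;
    }

    /* "clone" rule: the minor must be new and unique unless it is
     * marked clonable; otherwise close the device and fail with ENXIO. */
    if (ex->in_use[returned_minor] && !ex->clonable[returned_minor]) {
        ex->close_calls++;
        return ENXIO;
    }

    ex->in_use[returned_minor] = true;
    return 0;
}
```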
For "clone," because minor devices can subsequently be opened explicitly
without a clone open (DRVOPEN), there should be one stream head per special
file. It is possible, however, to even accommodate this by stacking stream
heads behind the special file and only permitting explicit open of the
topmost stream-head in the stack.
This latter is the approach that I took with Linux Fast-STREAMS. If a
"clone" open returns a device number that already exists, the new stream
head is used and the stream head is "stacked" on the special file. A clone
open (whether via "clone" or a clonable minor) will always result in a new
stream head "stacked" in this fashion. In the case of "clone" there should
be only one stream head in the stack if the documentation rules ("new and
unique minor device number assigned") are followed. If an explicit open of
the device number (DRVOPEN) is performed, then the topmost stream head in the
"stack" has its qi_open() procedure called.
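The stacking scheme can be pictured with a small model such as the following. It is only a sketch of the idea as described here, not Linux Fast-STREAMS code; `stacked_head`, `sketch_specfile`, `sketch_clone_push` and `sketch_explicit_open` are hypothetical names, and the `open_calls` counter stands in for an actual qi_open() call.

```c
#include <stddef.h>

/* Hypothetical stream head that can be stacked behind a special file. */
struct stacked_head {
    struct stacked_head *below;  /* next head down in the stack */
    int id;                      /* label for the sketch only */
    int open_calls;              /* stands in for calls to qi_open() */
};

/* Hypothetical special file: holds the top of its stack of heads. */
struct sketch_specfile {
    struct stacked_head *top;
};

/* A clone open always produces a new head, pushed on top of the stack. */
static void sketch_clone_push(struct sketch_specfile *sf,
                              struct stacked_head *sh)
{
    sh->below = sf->top;
    sf->top = sh;
}

/* An explicit open (DRVOPEN) reaches only the topmost head.
 * Returns that head, or NULL if the stack is empty. */
static struct stacked_head *sketch_explicit_open(struct sketch_specfile *sf)
{
    struct stacked_head *sh = sf->top;
    if (sh != NULL)
        sh->open_calls++;
    return sh;
}
```

If the "new and unique" rule is followed, each special file's stack holds a single head and this reduces to the usual one-head-per-device case.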
This suits my purposes well and seems consistent with the uw7 documentation.
I have the difficulty that I need to open thousands of streams for
configuration of SS7 protocol stacks. Under LiS, I used to dynamically assign
new major device numbers (sometimes on demand), and used the capability that
LiS allows you to reassign the major device number on open. (In fact LiS
allows you to reassign the major and minor device number on a DRVOPEN, which
made things easier). This is consistent with the "new and unique"
requirement. The strinet driver does this too. Although it has clone minors
(/dev/tcp, /dev/udp etc. are all minor device numbers under the socksys major)
a new unique major/minor device number combination is assigned for each
stream.
One problem with this approach was that I had applications where I needed to
open 10's of thousands of streams, linked under multiplexing drivers and the
like. Almost all these streams are pseudo-devices and few need to be opened
explicitly via major/minor device number. The system could quite quickly run
out of device numbers altogether if the "new and unique" approach is used.
(Linux at one time used to associate a character device with each socket
opened. Failure to scale is why it was dropped.) The clonable minor approach
allows any number of stream heads to be stacked behind the same minor device
number, yet is compatible with the old "clone" method.
It looks like "clone" goes all the way back to Ritchie's Research UNIX, where the original "new and unique" requirement came from.
It is in fact a lot cleaner if, when a clone-opened driver returns an existing
major/minor device number, the stream head is "stacked" on that special
file rather than anything else. This is because if the driver has returned
zero (0) it has attached (or could have attached) private structures to the
passed in queue pair. This stream head should not be freed without calling
the driver's qi_close() procedure so that it can remove its private
structures from the queue pair. So I suppose the choice is to "stack" the
stream or to call qi_close() and return ENXIO to open(2). I think the only
reason LiS is doing any different right now is because "connld" uses this
corner to redirect the open to the new existing pipe end. I rewrote "connld"
for Linux Fast-STREAMS so that this behavior was no longer necessary ("connld"
more directly installs the stream head for the new pipe end against the file
descriptor.) This was a lot easier (it seems) in SVR 4 because SVR 4 returns
the vnode as a return to c_open calling spec_open. Linux must actually
install dentry and vfsmount against the file pointer.
One more note about stacking stream heads: the file->private_data pointer
always points to the stream head in the stack that belongs to an open file
descriptor. Thus a file pointer only references one stream head during its
lifetime. An inode, on the other hand can reference multiple stream heads.
I wish I could point you to the code for Linux Fast-STREAMS so you could follow what I have done as well, but I'm still packaging for release...
--brian
On Mon, 06 Oct 2003, Dave Grothe wrote:
> I have the old multi-threaded open/close race problem fixed, or at least I
> will claim that it is fixed until someone produces an example that
> demonstrates otherwise.
>
> Before releasing the LiS incorporating these changes I need to solicit your
> opinions on a matter.
>
> When a user process opens a specific stream {maj,min} for a second or
> subsequent open, LiS (and Solaris) calls the driver open routines for all
> the modules in the stream to announce this fact. These open calls are not
> balanced by calls to the driver close routine, since the close routine is
> called only on the LAST close on the stream.
>
> So far, so good.
>
> When a clone open occurs the initial open is to {clone,maj} with the
> driver's open routine being called with {maj,0} and the clone flag
> set. The driver returns an updated {maj,min} to use. If this {maj,min} is
> currently not open then STREAMS simply uses the stream head structure that
> was originally for {clone,maj} to operate {maj,min}.
>
> So far, so good.
>
> But what happens if a clone open returns a {maj,min} that is already
> open? At the moment, LiS does NOT call the open routine of the driver that
> owns {maj,min}, on the theory that the driver open routine was already
> called via the clone driver.
>
> But is that correct? An alternate theory is that a clone open to an
> already-open {maj,min} could be seen as an indirect open to that stream and
> that normal re-open semantics would be to call the open routines.
>
> Does anyone have any direct evidence of what other STREAMS implementations
> do in this case? I don't have a Solaris test case for this.
>
> Thanks,
> Dave
>
>
> _______________________________________________
> Linux-streams mailing list
> [EMAIL PROTECTED]
> http://gsyc.escet.urjc.es/mailman/listinfo/linux-streams
-- Brian F. G. Bidulock
[EMAIL PROTECTED]
http://www.openss7.org/

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all progress
depends on the unreasonable man. -- George Bernard Shaw
