Re: hotplug and modalias handling (WAS: Re: RFD: Rework/extending functionality of mdev)
On Wed, Mar 18, 2015 at 02:48 AM, Isaac Dunham said: Is this manifested as the root device never shows up? Yes, although we call it the boot device. As the one who probably posted this, I can comment further: I've heard of one computer where this was an issue, a couple of years ago; a number of people on the Puppy Linux forums were experimenting with mdev, and one reported that the modalias file was missing with a Broadcom wireless card. That is interesting. Do you think this could have been due to a bug in the Broadcom driver? We don't do any networking in our initrd/initramfs so maybe we can get by with the faster methods that use modalias files instead of uevent files. The broken hotplugging (or whatever the problem is) is a bigger issue for us now even though it is small compared to the problems caused by the broken modprobe a year ago. However, you may find it worth noting that find /sys will get its list of files slightly later than globbing will. I don't know what the globbing solution is that you refer to. It took me a little while to understand the following solution from the alpinelinux initrd/initramfs, partly because it does not work here at all:

    find /sys -name modalias | xargs sort -u | xargs modprobe -a

It fails badly whenever one or more modalias files have a space in the path (which is the case here). That is easily remedied with:

    find /sys -name modalias -print0 \
        | xargs -0 sort -u \
        | xargs modprobe -a -q -b 2>/dev/null

This is probably more efficient than what I'm currently doing if /sys is replaced with /sys/devices. Thank you. Do you wait for the boot device to show up? Oh yes. It would not work if we didn't. I know the kernel has a rootwait parameter for similar issues with usb drives. We stopped using the related rootdelay years ago. We try to find the boot device as quickly as possible.
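The space-safe pipeline above can be sketched end to end. To keep it runnable anywhere, the sketch below scans a mock sysfs tree under a temp directory and uses a stub modprobe that just records the aliases it is asked to load; the mock paths, example alias strings, and the stub are my own assumptions. A real initramfs would root the find at /sys/devices and use the real modprobe.

```shell
#!/bin/sh
# Mock sysfs tree with a space in one path, plus a duplicate alias,
# to exercise both the -print0 handling and the sort -u deduplication.
tmp=$(mktemp -d)
mkdir -p "$tmp/sys/devices/pci0000:00/dev one" \
         "$tmp/sys/devices/pci0000:00/dev two" "$tmp/bin"
echo 'pci:v00001002d00004391' > "$tmp/sys/devices/pci0000:00/dev one/modalias"
echo 'usb:v0781p5567'         > "$tmp/sys/devices/pci0000:00/dev two/modalias"
echo 'pci:v00001002d00004391' > "$tmp/sys/devices/modalias"   # duplicate

# Stub modprobe: skip option flags, record each alias it would load.
cat > "$tmp/bin/modprobe" <<'EOF'
#!/bin/sh
for a in "$@"; do case "$a" in -*) ;; *) echo "$a" >> "$LOADED" ;; esac; done
EOF
chmod +x "$tmp/bin/modprobe"
export LOADED="$tmp/loaded" PATH="$tmp/bin:$PATH"

# The pipeline under discussion, rooted at the mock /sys/devices.
find "$tmp/sys/devices" -name modalias -print0 \
    | xargs -0 sort -u \
    | xargs modprobe -a -q -b 2>/dev/null

loaded=$(sort "$LOADED")
echo "$loaded"
rm -rf "$tmp"
```

Despite the space in "dev one", all three modalias files are read, and the duplicate pci alias is loaded only once.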
I don't think rootdelay is applicable unless root= is specified (or something like that) but even if it were, making users guess how long it's going to take for the buses to settle (or whatever) seemed overly crude. We have a fixed timeout of 10 or 15 seconds. We keep trying to find and mount the boot device until the timeout is reached. If we can't find it within that time window then we give up. As long as the correct modules get loaded, this scheme works well. It tends to boot as quickly as possible without adding unnecessary delays. Of course, the root device may be unknown depending on what you're doing, which would make waiting properly rather difficult. We have several different modes. The default mode is to scan all usb/removable devices and all cdrom/dvd devices, mounting each in turn, looking for the file(s) we need. This is repeated until the file is found or the timeout is reached. You can also specify the boot device with a label or a uuid (even a partial uuid) or a device name like sdb1. You can also specify the type of device: cd (includes dvd), usb (includes non-usb removable devices), or hd (internal drives). On machines that can't boot directly from usb, the from=usb boot parameter has proved to be popular. The boot starts with a LiveCD but then uses a compatible usb stick as the boot device, which is faster than the LiveCD and makes other features available such as persistence and remastering. (What I'd be inclined to do is check for the root and if it fails, coldplug a second time/blindly load sd_mod and usb_storage) My latest solution is to coldplug inside the loop that looks for the boot device. When we were experiencing the bug in the smaller busybox modprobe about a year ago we tried various schemes of always loading certain modules but that was not very satisfactory. It masks the problem instead of fixing it.
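The scheme described above (a fixed timeout, with a coldplug inside the loop that probes for the boot device) can be sketched in a few lines. The names coldplug and find_boot_device are hypothetical stand-ins, not code from the thread; they are stubbed here so the sketch runs, with the stub "finding" the device on the third attempt.

```shell
#!/bin/sh
# Hedged sketch of "coldplug inside the loop that looks for the boot device".
tries=0
coldplug() { :; }          # real code: scan modalias files and modprobe them
find_boot_device() {
    tries=$((tries + 1))
    [ "$tries" -ge 3 ]     # stub: pretend the device shows up on attempt 3
}

TIMEOUT=10                 # fixed timeout, as in the message above
start=$(date +%s)
while :; do
    coldplug                          # pick up hardware registered since last pass
    if find_boot_device; then
        echo "boot device found after $tries attempts"
        break
    fi
    if [ $(( $(date +%s) - start )) -ge "$TIMEOUT" ]; then
        echo "giving up after ${TIMEOUT}s" >&2
        exit 1
    fi
    sleep 1
done
```

Because the loop exits as soon as the device appears, it adds no delay on fast hardware while still tolerating slow buses up to the timeout.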
If loading of hardware specific modules is not 100% reliable then where do you draw the line on which specific modules to load on every machine? Likewise, loading a bunch of modules after a delay can in some cases just further postpone the eventual failure. ISTM repeated coldplugging is a reasonable compromise even if it is not as elegant as hotplugging. At least it only loads modules that correspond to the hardware. Thank you for your help. This discussion has been useful to me. Peace, James ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: hotplug and modalias handling (WAS: Re: RFD: Rework/extending functionality of mdev)
On Thu, Mar 19, 2015 at 06:02:33PM -0600, James Bowlin wrote: On Wed, Mar 18, 2015 at 02:48 AM, Isaac Dunham said: Is this manifested as the root device never shows up? Yes, although we call it the boot device. As the one who probably posted this, I can comment further: I've heard of one computer where this was an issue, a couple of years ago; a number of people on the Puppy Linux forums were experimenting with mdev, and one reported that the modalias file was missing with a Broadcom wireless card. That is interesting. Do you think this could have been due to a bug in the Broadcom driver? We don't do any networking in our initrd/initramfs so maybe we can get by with the faster methods that use modalias files instead of uevent files. The broken hotplugging (or whatever the problem is) is a bigger issue for us now even though it is small compared to the problems caused by the broken modprobe a year ago. However, you may find it worth noting that find /sys will get its list of files slightly later than globbing will. I don't know what the globbing solution is that you refer to. Sorry, your reference to /sys/devices/ made me semi-remember this bit (again, from the Puppy Linux forums):

    grep -h ^MODALIAS= /sys/bus/*/devices/*/uevent | cut -c 10-

See: http://www.murga-linux.com/puppy/viewtopic.php?t=78941&start=210 It took me a little while to understand the following solution from the alpinelinux initrd/initramfs, partly because it does not work here at all:

    find /sys -name modalias | xargs sort -u | xargs modprobe -a

It fails badly whenever one or more modalias files have a space in the path (which is the case here). That is easily remedied with:

    find /sys -name modalias -print0 \
        | xargs -0 sort -u \
        | xargs modprobe -a -q -b 2>/dev/null

This is probably more efficient than what I'm currently doing if /sys is replaced with /sys/devices. Thank you. Ah.
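The grep-on-uevent variant quoted above can be demonstrated the same way. The mock /sys/bus tree and the example MODALIAS values below are invented for illustration; on a real system the glob would be /sys/bus/*/devices/*/uevent and the result would feed xargs modprobe:

```shell
#!/bin/sh
# Mock /sys/bus tree: uevent files carry MODALIAS= lines among other keys.
BUS=$(mktemp -d)
mkdir -p "$BUS/pci/devices/0000:00:1f.3" "$BUS/usb/devices/1-1"
printf 'DRIVER=snd_hda_intel\nMODALIAS=pci:v8086d9D71\n' \
    > "$BUS/pci/devices/0000:00:1f.3/uevent"
printf 'MODALIAS=usb:v0781p5567\n' > "$BUS/usb/devices/1-1/uevent"

# grep -h drops the filename prefix; cut -c 10- strips "MODALIAS=".
aliases=$(grep -h '^MODALIAS=' "$BUS"/*/devices/*/uevent | cut -c 10- | sort -u)
echo "$aliases"
rm -rf "$BUS"
```

A trailing `| xargs modprobe -a -q -b` would then load the modules. Since globbing only looks at fixed depth under /sys/bus, it avoids walking the whole /sys tree, which is the speed difference Isaac alludes to.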
Where SYSBASE is the directory you're searching in, you might prefer to use:

    find $SYSBASE -name modalias -exec sort -u '{}' + | xargs modprobe ...

(i.e., use find -exec instead of find -print0 | xargs -0). FYI: I find that there are uevents in /sys/bus/ and /sys/module/, but no modalias files. [snip] My latest solution is to coldplug inside the loop that looks for the boot device. When we were experiencing the bug in the smaller busybox modprobe about a year ago we tried various schemes of always loading certain modules but that was not very satisfactory. It masks the problem instead of fixing it. If loading of hardware specific modules is not 100% reliable then where do you draw the line on which specific modules to load on every machine? Likewise, loading a bunch of modules after a delay can in some cases just further postpone the eventual failure. That sounds like a good course. ISTM repeated coldplugging is a reasonable compromise even if it is not as elegant as hotplugging. At least it only loads modules that correspond to the hardware. Thank you for your help. This discussion has been useful to me. Glad to be of help. Thanks, Isaac Dunham
Re: RFD: Rework/extending functionality of mdev
Le 18/03/2015 18:41, Laurent Bercot wrote: On 18/03/2015 18:08, Didier Kryn wrote: No, you must write to the pipe to detect it is broken. And you won't try to write before you've got an event from the netlink. This event will be lost. I skim over that discussion (because I don't agree with the design) so I can't make any substantial comments, but here's a nitpick: if you use an asynchronous event loop, your selector triggers (POLLHUP for poll(); not sure if it's writability or exception for select()) as soon as a pipe is broken. Hi Laurent. My experience is that select() never gives you anything via the exception set, on Linux. And fifosvd must close the write end of the pipe; therefore it cannot poll for writability. Didier
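Didier's point, that a writer which never polls only learns the pipe is broken when it actually writes, can be shown in a few lines of shell. This is an illustration of general pipe semantics (my own demo, not nldev code); the fifo name is arbitrary:

```shell
#!/bin/sh
# The reader consumes one line and exits; the writer gets no notification
# until its next write, which then fails with SIGPIPE/EPIPE.
fifo="${TMPDIR:-/tmp}/pipe-demo-$$"
mkfifo "$fifo"
( read -r line < "$fifo" ) &    # reader: consume one line, then exit
exec 3> "$fifo"                 # writer end (open blocks until reader opens)
echo "event 1" >&3              # delivered; the reader then exits
wait                            # reader is gone, but nothing told us yet
trap '' PIPE                    # ignore SIGPIPE so the write fails with EPIPE
if echo "event 2" >&3 2>/dev/null; then
    result="write ok"
else
    result="broken pipe detected on write"
fi
echo "$result"
exec 3>&-
rm -f "$fifo"
```

So between "event 1" and "event 2" the writer sits in blissful ignorance, which is exactly why an event can be lost: by the time the write fails, the message it carried has nowhere to go.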
Re: RFD: Rework/extending functionality of mdev
Le 18/03/2015 20:01, Harald Becker wrote: Do you think it matters losing one more event? Here we are considering the case when fifosvd is killed (say by admin's error). I understand lost events can be recovered. However there is one distinctive advantage in detecting immediately the death of fifosvd: nldev can die immediately, causing a new working chain to be established immediately, *before* a possible burst of events. This avoids forking 3 daemons just when the next event happens. The necessary code in nldev consists only in invoking sigaction() with a trivial intercept function and testing some flag on return from any blocked state (poll/read/write). Note that you probably already want this sigaction and intercept to capture SIGTERM. This is fine as long as the netlink reader keeps control on its exit, not if it's killed. And when netlink is killed, then it is the responsibility of the higher instance to bring the required stuff up again. Sure, we agreed on that. But living orphans should not be left behind, and, in this respect, it is nldev which is in charge of fifosvd; the higher instance can't do it. This netlink reader you describe is not the general tool we were considering up to now, the simple data funnel. My pseudo code described the principal operation and data flow, not every gory detail of failure management. So the netlink reader described here is what I last called "netlink the Unix way". If the idea is to integrate such peculiarities as execing a script, then it is not the general tool, and why not integrate as well the supervision of mdev-i instead of needing fifosvd. The reason for fifosvd was AFAIU to associate general tools, nldev and mdev-i. ??? Don't know if I fully understand you here. And why shall execing a failure script violate making netlink a general tool? consider: nldev -e /path/to/failure/script I must say two things: First, I didn't understand correctly what you had written and didn't appreciate the -e option.
Second, I don't know what you include in the failure management, but I think part of it should be to get rid of the child. Doing it is going to be complicated in the script; at least you need to pass the pid because it is unknown to the shell. Instead, it is pretty simple in nldev: you just need to invoke wait() and syslog the exit status. The purpose of wait() isn't to check the pid of the process - we know who it is -, it's to remove the zombie, and get its exit status. This logic does no harm, whatever the way nldev is invoked. Even if it hasn't inherited a child, wait() returns immediately. I agree, though, that a comprehensive parsing of the status would take some lines of code. With maybe a default of /sbin/nldev-fail. Maybe with a default behaviour of not execing anything - this option must be provided in some way. I skip the rest of the discussion because I would repeat the same things :-) And we agree that fifosvd can know the pipe is broken from the return code of the handler, and it's enough to have one way to know it. Didier
Re: hotplug and modalias handling (WAS: Re: RFD: Rework/extending functionality of mdev)
On Tue, 17 Mar 2015 17:51:39 -0600 James Bowlin bit...@gmail.com wrote: On Mon, Mar 16, 2015 at 09:55 AM, Natanael Copa said: On Fri, 13 Mar 2015 13:12:56 -0600 James Bowlin bit...@gmail.com wrote: TL;DR: the current busybox (or my code) seems to be broken on one or two systems out of thousands; crucial modules don't get loaded using hotplug + coldplug. Please give me a runtime option so I can give my users a boot-time option to try the new approach. ... My current plan is to make repeated coldplugging the default method for loading modules since in every case I've been able to investigate, an alias for the missing module(s) is in the output of the find command I gave above. I haven't yet done exhaustive testing but every test with repeated coldplugging has worked even on the system where the hotplugging is flaky (and which is now out of my reach). Interestingly, this is what we do on alpine linux too: http://git.alpinelinux.org/cgit/mkinitfs/tree/initramfs-init.in#n530 So apparently we must have had the same problem in the past. My current plan is to use netlink to serialize the hotplug events and instead of scanning /sys/devices for modalias entries, I'll trigger 'add' uevents. (I will also make the netlink listener stop as soon as it has collected all needed bits to set up the root fs) I don't know why hotplugging works most of the time but not all the time. It could be a race condition. I wonder if the problem persists if you serialize mdev using /dev/mdev.seq ISTM as long as I start the hotplugging before I do the first (and used to be only) coldplug, there is not a lot I can mess up. Another check is that with hotplugging disabled and only a single coldplug then there are very few (if any?) situations where it will boot since it usually takes a few seconds for the boot device to show up and that first coldplug is done ASAP. Peace, James
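Natanael's "trigger 'add' uevents" approach works by writing the string add into each device's uevent file, which makes the kernel re-emit that device's add event for the netlink listener to serialize. A minimal sketch, run against a mock tree so it needs no root; on a real system the find would walk /sys/devices:

```shell
#!/bin/sh
# Mock /sys/devices tree standing in for the real one.
tmp=$(mktemp -d)
mkdir -p "$tmp/devices/pci0000:00/0000:00:02.0"
: > "$tmp/devices/pci0000:00/0000:00:02.0/uevent"

# Coldplug by uevent triggering: on a real /sys, each write makes the
# kernel replay that device's "add" event over netlink.
find "$tmp/devices" -name uevent | while read -r f; do
    echo add > "$f"
done

triggered=$(cat "$tmp/devices/pci0000:00/0000:00:02.0/uevent")
echo "$triggered"
rm -rf "$tmp"
```

Compared with scanning modalias files and calling modprobe directly, this routes everything through the same netlink/hotplug path, so cold- and hot-plugged devices are handled by one code path.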
Re: RFD: Rework/extending functionality of mdev
Le 17/03/2015 19:56, Harald Becker wrote: Hi Didier, On 17.03.2015 19:00, Didier Kryn wrote: The common practice of daemons putting themselves in the background and orphaning themselves is starting to become disapproved of by many designers. I tend to share this opinion. If such a behaviour is desired, it may well be done in the script (nohup), and the go-to-background feature be completely removed from the daemon proper. The idea behind this change is to allow the supervisor not to be process #1. Ack, for the case where the daemon does not allow being used with an external supervisor. Invoking a daemon from scripts is no problem, but did you ever come into a situation where you needed to maintain a system by hand? Therefore I personally vote for having a simple command doing auto-backgrounding of the daemon, allowing it to run from a supervisor via a simple extra parameter (e.g. -n). Which is usually no problem, as the supervisor needs some kind of configuration where you should be able to add the arguments the daemon gets started with. So you have to enter that parameter just once for your usage from the supervisor, but save extra parameters for manual invocation. Long lived daemons should have both startup methods, selectable by a parameter, so you make nobody's work more difficult than required. Dropping the auto-background feature would mean saving a single function call to fork and maybe an exit. This will result in a saving of roughly 10 to 40 bytes in the binary (typical x86 32-bit). Too much cost to allow both usages? OK, I think you are right, because it is a little more than a fork: you want to detach from the controlling terminal and start a new session. I agree that it is a pain to do it by hand and it is OK if there is a command-line switch to avoid all of it. But there must be this switch. Could you clarify, please: do you mean implementing in netlink the logic to restart fifosvd? Previously you described it as just a data funnel.
No, restart is not required: as netlink dies when fifosvd dies (or later on when the handler dies), the supervisor watching netlink may then fire up a new netlink reader (possibly after failure management), where this startup is always done through a central startup command (e.g. xdev). The supervisor never starts up the netlink reader directly, but watches the process it starts up for xdev. xdev does its initial action (startup code) then chains (exec) to the netlink reader. This may look ugly and unnecessarily complicated at first glance, but it is a known practical trick to drop some memory resources not needed by the long lived daemon but required by the startup code. For the supervisor instance this looks like a single process it has started and may watch until it exits. So from that view it looks as if netlink has created the pipe and started the fifosvd, but in fact this is done by the startup code (the difference between the flow of operation and the technical placement of the code). I didn't notice this trick in your description. It is making more and more sense :-). Now look, since nldev (let's call it by its name) is execed by xdev, it remains the parent of fifosvd, and therefore it shall receive SIGCHLD if fifosvd dies. This is the best way for nldev to watch fifosvd. Otherwise it would have to wait until it receives an event from the netlink and tries to write it to the pipe, hence losing the event and the possible burst following it. nldev must die on SIGCHLD (after piping available events, though); this is the only supervision logic it must implement, but I think it is critical. And it is the same if nldev is launched with a long-lived mdev-i without a fifosvd. Well, this is what I thought, but the manual says a closed write end causes end-of-file, without mentioning that the pipe must first be empty. End-of-file always includes the pipe being empty. Consider a pipe which still has some data in it when the writer closes the write end.
If the reader received EOF before all data had been consumed, it would lose some data. That would be absolutely unreliable. Therefore, the EOF is only forwarded to the read end when the pipe is empty. I agree that the other way wouldn't work. Just noticing the manual is wrong/unclear on that point. *Does anybody know the exact specification of poll behavior in this case?* My experience, with select() which is roughly the same, is that it does not detect EOF. And, since fifosvd must not read the pipe, how does it detect that it is broken? Not detect? Are you sure you closed all open file descriptors for the write end (a common caveat)? I have never been hit by such a case, except when someone forgot to close all file descriptors of the write end. You notice that something happened on input (AFAIR) but I'm sure you don't know what. It may be data as well. You must read() to know. Anyway you don't want to poll() the pipe unless mdev-i is dead
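The claim that EOF is delivered only after the pipe drains is easy to check from shell. This small demo (my own, not from the thread) has the writer exit long before the reader starts reading:

```shell
#!/bin/sh
# printf exits immediately; the reader sleeps first, yet still receives
# both buffered lines before the third read() finally returns EOF.
out=$(
    printf 'one\ntwo\n' | {
        sleep 1            # writer has exited long before we read
        read -r a          # still delivered
        read -r b          # still delivered
        if read -r c; then
            echo "unexpected extra data"
        else
            echo "EOF only after draining: $a $b"
        fi
    }
)
echo "$out"
```

So a handler draining a fifo after fifosvd's writer side has closed will still see every queued event before it sees end-of-file, which is exactly the property the design above relies on.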
Re: RFD: Rework/extending functionality of mdev
On 18.03.2015 10:42, Didier Kryn: Long lived daemons should have both startup methods, selectable by a parameter, so you make nobody's work more difficult than required. OK, I think you are right, because it is a little more than a fork: you want to detach from the controlling terminal and start a new session. I agree that it is a pain to do it by hand and it is OK if there is a command-line switch to avoid all of it. But there must be this switch. Ack! No, restart is not required: as netlink dies when fifosvd dies (or later on when the handler dies), the supervisor watching netlink may then fire up a new netlink reader (possibly after failure management), where this startup is always done through a central startup command (e.g. xdev). The supervisor never starts up the netlink reader directly, but watches the process it starts up for xdev. xdev does its initial action (startup code) then chains (exec) to the netlink reader. This may look ugly and unnecessarily complicated at first glance, but it is a known practical trick to drop some memory resources not needed by the long lived daemon but required by the startup code. For the supervisor instance this looks like a single process it has started and may watch until it exits. So from that view it looks as if netlink has created the pipe and started the fifosvd, but in fact this is done by the startup code (the difference between the flow of operation and the technical placement of the code). I didn't notice this trick in your description. It is making more and more sense :-). I left it out to make it not unnecessarily complicated, and I wanted to focus on the netlink / pipe operation. Now look, since nldev (let's call it by its name) is execed by xdev, it remains the parent of fifosvd, and therefore it shall receive SIGCHLD if fifosvd dies. This is the best way for nldev to watch fifosvd.
Otherwise it would have to wait until it receives an event from the netlink and tries to write it to the pipe, hence losing the event and the possible burst following it. nldev must die on SIGCHLD (after piping available events, though); this is the only supervision logic it must implement, but I think it is critical. And it is the same if nldev is launched with a long-lived mdev-i without a fifosvd. The netlink reader (nldev) does not need to explicitly watch the fifosvd via SIGCHLD. Either that piece of code does its job, or it fails and dies. When fifosvd dies, the read end of the pipe is closed (by the kernel), unless there is still a handler process (which shall process remaining events from the pipe). As soon as there is neither a fifosvd nor a handler process, the pipe is shut down by the kernel, and nldev gets an error when writing to the pipe, so it knows the other end died. You won't gain much benefit from watching SIGCHLD and reading the process status. It will either give you the information that the fifosvd process is still running, or that it died (failed). The same information you get from the write to the pipe: when the read end dies, you get EPIPE. Limiting the time nldev tries to write to the pipe would also allow detecting stuck operation of fifosvd / handler (which SIGCHLD watching won't give you) ... but (in parallel I discussed that with Laurent) the question is how to react when the write to the pipe is stuck (but no failure)? We can't do much here, and are in trouble either way, but Laurent gave the argument: the netlink socket also contains a buffer, which may hold additional events, so we do not lose them in case processing continues normally.
The details of the actions taken in this case need to, and can, be discussed (and may be adapted later) without much impact on other operation. This clearly means I'm open to suggestions on which kind of failure handling shall be done. Every action taken to improve reaction, which is of benefit for the major purpose of the netlink reader without blowing it up needlessly, is of interest (keep in mind: long lived daemon, trying to keep it simple and small). My suggestion is: let the netlink reader detect relevant errors and exec (not spawn) a script of a given name when there are failures. This is small, and gives the invoked script full control over the failure management (no fixed functionality in a binary). When done, it can either die, letting a higher instance do the restart job, or exec back and re-start the hotplug system (maybe with a different mechanism). When the script does not exist, the default action is to exit the netlink reader process unsuccessfully, giving a higher instance a failure indication and the possibility to react on it. Not detect? Are you sure you closed all open file descriptors for the write end (a common caveat)?
Re: RFD: Rework/extending functionality of mdev
On 18/03/2015 18:08, Didier Kryn wrote: No, you must write to the pipe to detect it is broken. And you won't try to write before you've got an event from the netlink. This event will be lost. I skim over that discussion (because I don't agree with the design) so I can't make any substantial comments, but here's a nitpick: if you use an asynchronous event loop, your selector triggers (POLLHUP for poll(); not sure if it's writability or exception for select()) as soon as a pipe is broken. Note that events can still be lost, because the pipe can be broken while you're reading a message from the netlink, before you come back to the selector; so the message you just read cannot be sent. But that is a risk you have to take every time you perform buffered IO; there's no way around it. -- Laurent
Re: RFD: Rework/extending functionality of mdev
Hi Laurent ! Note that events can still be lost, because the pipe can be broken while you're reading a message from the netlink, before you come back to the selector; so the message you just read cannot be sent. But that is a risk you have to take every time you perform buffered IO; there's no way around it. To make clear what case we are talking about, that is:

- spawn conf parser / device operation process
- exit with failure
- re-spawn conf parser / device operation process
- exit with failure
- re-spawn conf parser / device operation process
- exit with failure
- ...
- detect failure loop
- spawn failure script
- exit with failure or non-zero status
- giving up, close read end of pipe
- let fifosvd die

@Laurent: What would you do in that case? Endless respawn? - shrek! -- Harald
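One possible answer to "endless respawn?" is a capped respawn loop. This is only my reading of the failure sequence above, not code from the thread; run_handler is a hypothetical stub that always fails so the sketch terminates:

```shell
#!/bin/sh
# Re-spawn the handler on failure, but detect a failure loop after a few
# consecutive failures instead of respawning forever.
MAX_FAILS=3
fails=0
run_handler() { return 1; }   # stand-in for spawning the conf parser / handler

while :; do
    if run_handler; then
        fails=0               # a successful run resets the failure counter
        continue
    fi
    fails=$((fails + 1))
    if [ "$fails" -ge "$MAX_FAILS" ]; then
        echo "failure loop after $fails consecutive failures; giving up" >&2
        break                 # real code: run the failure script, close the pipe
    fi
done
echo "fifosvd exits; nldev will then see the pipe break on its next write"
```

Resetting the counter on success distinguishes a genuine failure loop (broken conf file, broken binary) from occasional crashes, which is the distinction the sequence above needs before giving up and letting fifosvd die.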
Re: RFD: Rework/extending functionality of mdev
Hi Laurent ! On 18/03/2015 18:08, Didier Kryn wrote: No, you must write to the pipe to detect it is broken. And you won't try to write before you've got an event from the netlink. This event will be lost. On 18.03.2015 18:41, Laurent Bercot wrote: I skim over that discussion (because I don't agree with the design) Why? Did you note my last two alternatives, unexpectedly both named #3? ... but specifically the last one, "netlink the Unix way"?

- it uses a private pipe for netlink and a named pipe for the hotplug helper (with a maximum of code sharing)
- it should most likely follow the flow of operation you suggested (as far as I understood you)
- except that I split off the pipe watcher / on-demand startup code of the conf parser / device operation into its own thread (process), for general code usability as a different applet for on-demand pipe consumer startup purposes (you had that function as an integral part of your netlink reader)
- and I'm currently going to split off that one-shot xdev init feature from xdev, creating an own applet / command for this, as you suggested (extending functionality for even more general usage, as suggested by Isaac, independent from the device management, and maybe modifiable in its operation by changing functions in a shell script)

So why do you still have doubts about the design? ... because I moved some code into its own (small) helper thread? I can't make any substantial comments, but here's a nitpick: if you use an asynchronous event loop, your selector triggers (POLLHUP for poll(); not sure if it's writability or exception for select()) as soon as a pipe is broken. This is what I expected, but the problem is that the question about this arrived, and I can't find the location where this behaviour is documented. Note that events can still be lost, because the pipe can be broken while you're reading a message from the netlink, before you come back to the selector; so the message you just read cannot be sent.
But that is a risk you have to take every time you perform buffered IO; there's no way around it. Ok, what would you then do? Unbuffered I/O on the pipe, and then what? ... If that single extra dropped message matters (besides the others not yet read from the netlink buffer, which are lost on close anyway), then we shall indeed use unbuffered I/O on the pipe, and only read a message when there is room for one more message in the pipe:

    set non-blocking I/O on stdout
    establish netlink socket
    loop:
        poll until write on stdout is possible
            (may set an upper timeout limit; failure on timeout)
        poll for netlink read, and still for write on stdout
            if writability drops, we are in serious trouble: failure
        if netlink read is possible:
            gather message from netlink
            write message to stdout (should never block)
            on EAGAIN, EINTR: do 3 write retries, then failure

... does that fit better? I don't think that it makes a big difference, but I can live with the slightly bigger code. My problem is not the detection of the failing pipe write, but the reaction to it. When that happens, the chain downstream of the pipe most likely needs more than just a restart. That is, it should only happen on serious failure in the conf file or the device operations (-> manual action required). So I expect more loss of event messages than just that single message you were grumbling about. Hence on hotplug restart we need to re-trigger the plug events nevertheless! -- Harald
Re: RFD: Rework/extending functionality of mdev
Le 18/03/2015 13:34, Harald Becker wrote: On 18.03.2015 10:42, Didier Kryn wrote: Long lived daemons should have both startup methods, selectable by a parameter, so you make nobody's work more difficult than required. OK, I think you are right, because it is a little more than a fork: you want to detach from the controlling terminal and start a new session. I agree that it is a pain to do it by hand and it is OK if there is a command-line switch to avoid all of it. But there must be this switch. Ack! No, restart is not required: as netlink dies when fifosvd dies (or later on when the handler dies), the supervisor watching netlink may then fire up a new netlink reader (possibly after failure management), where this startup is always done through a central startup command (e.g. xdev). The supervisor never starts up the netlink reader directly, but watches the process it starts up for xdev. xdev does its initial action (startup code) then chains (exec) to the netlink reader. This may look ugly and unnecessarily complicated at first glance, but it is a known practical trick to drop some memory resources not needed by the long lived daemon but required by the startup code. For the supervisor instance this looks like a single process it has started and may watch until it exits. So from that view it looks as if netlink has created the pipe and started the fifosvd, but in fact this is done by the startup code (the difference between the flow of operation and the technical placement of the code). I didn't notice this trick in your description. It is making more and more sense :-). I left it out to make it not unnecessarily complicated, and I wanted to focus on the netlink / pipe operation. Now look, since nldev (let's call it by its name) is execed by xdev, it remains the parent of fifosvd, and therefore it shall receive SIGCHLD if fifosvd dies. This is the best way for nldev to watch fifosvd.
Otherwise it would have to wait until it receives an event from the netlink and tries to write it to the pipe, hence losing the event and the possible burst following it. nldev must die on SIGCHLD (after piping available events, though); this is the only supervision logic it must implement, but I think it is critical. And it is the same if nldev is launched with a long-lived mdev-i without a fifosvd. The netlink reader (nldev) does not need to explicitly watch the fifosvd via SIGCHLD. Either that piece of code does its job, or it fails and dies. When fifosvd dies, the read end of the pipe is closed (by the kernel), unless there is still a handler process (which shall process remaining events from the pipe). As soon as there is neither a fifosvd nor a handler process, the pipe is shut down by the kernel, and nldev gets an error when writing to the pipe, so it knows the other end died. No, you must write to the pipe to detect it is broken. And you won't try to write before you've got an event from the netlink. This event will be lost. You won't gain much benefit from watching SIGCHLD and reading the process status. It will either give you the information that the fifosvd process is still running, or that it died (failed). The same information you get from the write to the pipe: when the read end dies, you get EPIPE. You get the information immediately from SIGCHLD. You get it too late from the pipe, and you lose at least one event for sure, a whole burst if there is one. Limiting the time nldev tries to write to the pipe would also allow detecting stuck operation of fifosvd / handler (which SIGCHLD watching won't give you) ... but (in parallel I discussed that with Laurent) the question is how to react when the write to the pipe is stuck (but no failure)? We can't do much here, and are in trouble either way, but Laurent gave the argument: the netlink socket also contains a buffer, which may hold additional events, so we do not lose them in case processing continues normally.
When the kernel buffer fills up to its limit, let the kernel react to the problem. Sure; the limit here is pipe size (adjustable) + netlink buffer size. ... otherwise you are right: nldev's job is to detect failure of the rest of the chain (that is, to supervise it), and it has to react to this. The details of the actions taken in this case need to be, and can be, discussed (and may be adapted later) without much impact on the rest of the operation. This clearly means I'm open to suggestions on which kind of failure handling shall be done. Every action that improves the reaction, and that benefits the major purpose of the netlink reader without blowing it up needlessly, is of interest (keep in mind: a long-lived daemon, trying to stay simple and small). My suggestion is: let the netlink reader detect relevant errors, and exec (not spawn) a script of a given name when there are failures. This is small, and gives the invoked script full control over the failure management (no fixed functionality in a binary). When
Re: RFD: Rework/extending functionality of mdev
Hi Didier, On 17.03.2015 19:00, Didier Kryn wrote: The common practice of daemons putting themselves in the background and orphaning themselves is starting to be disapproved of by many designers. I tend to share this opinion. If such a behaviour is desired, it may well be done in the script (nohup), and the go-to-background feature can be completely removed from the daemon proper. The idea behind this change is to allow for a supervisor that is not process #1. Ack, for the case where the daemon otherwise cannot be used with an external supervisor. Invoking a daemon from scripts is no problem, but did you ever end up in a situation where you needed to maintain a system by hand? Therefore I personally vote for having the simple command auto-background the daemon, while allowing it to run under a supervisor via a simple extra parameter (e.g. -n). This is usually no problem, as the supervisor needs some kind of configuration anyway, where you should be able to add the arguments the daemon gets started with. So you have to enter that parameter just once for supervisor usage, but save the extra parameters for manual invocation. Long-lived daemons should have both startup methods, selectable by a parameter, so you don't make anybody's work more difficult than required. Dropping the auto-background feature would mean saving a single function call to fork and maybe an exit. This results in a saving of roughly 10 to 40 bytes in the binary (typical x86 32-bit). Too high a cost to allow both usages? Could you clarify, please: do you mean implementing in netlink the logic to restart fifosvd? Previously you described it as just a data funnel. No, restart is not required: netlink dies when fifosvd dies (or later, when the handler dies), and the supervisor watching netlink may then fire up a new netlink reader (possibly after failure management), where this startup is always done through a central startup command (e.g. xdev).
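The "-n" convention being argued for can be sketched as follows; run_daemon is a hypothetical stand-in, and the echoes stand in for the real daemonization steps (fork, new session, detach from the terminal):

```shell
# Sketch of the '-n' switch convention: the same program either
# backgrounds itself (default, convenient for manual use) or stays in
# the foreground so a supervisor can watch it. All names are stand-ins.
run_daemon() {
    if [ "$1" = "-n" ]; then
        # supervisor mode: remain in the foreground
        echo "foreground (supervised)"
    else
        # manual mode: a real daemon would fork and detach (setsid) here
        ( echo "background (self-detached)" ) &
        wait
    fi
}
run_daemon -n
run_daemon
```

The cost of supporting both modes is essentially the one fork/exit pair mentioned above.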
The supervisor never starts the netlink reader directly; it watches the process it started for xdev. xdev does its initial action (startup code), then chains (execs) to the netlink reader. This may look ugly and unnecessarily complicated at first glance, but it is a well-known practical trick to drop memory resources that are needed by the startup code but not by the long-lived daemon. To the supervisor this looks like a single process, which it started and can watch until it exits. So from that view it looks as if netlink created the pipe and started fifosvd, but in fact this is done by the startup code (the difference between the flow of operation and the technical placement of the code). Well, this is what I thought, but the manual says a closed write end causes end-of-file, without mentioning the pipe being empty. End-of-file always includes the pipe being empty. Consider a pipe which still has some data in it when the writer closes the write end. If the reader received EOF before all data had been consumed, it would lose some data. That would be absolutely unreliable. Therefore, EOF is only delivered to the read end when the pipe is empty. *Does anybody know the exact specification of poll behaviour in this case?* My experience with select(), which is roughly the same, is that it does not detect EOF. And, since fifosvd must not read the pipe, how does it detect that it is broken? Not detect it? Are you sure you closed all open file descriptors for the write end (a common caveat)? I have never been hit by such a case, except when someone forgot to close all file descriptors of the write end. No, they should still be processed by the handler, which then stumbles on the EOF once all event messages have been read. See above. It would make sense, but the manual does not say that. I bet the manual is wrong in this case. It's the established behaviour of pipes in the Unix world; the specification of this may go back to K&R in the 1970s. PS.
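The claim above - that EOF reaches the reader only after the pipe has drained - is easy to demonstrate from the shell (a minimal illustration, not the mdev code):

```shell
# The writer exits immediately after writing; the reader nevertheless
# receives all three lines before seeing EOF, because EOF is delivered
# only once the pipe is empty.
count=$(printf 'one\ntwo\nthree\n' | wc -l)
echo "lines read before EOF: $count"
```

If EOF were delivered as soon as the writer closed its end, data still buffered in the pipe would be lost; the count shows it is not.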
I inadvertently went off-list. Just my habit of clicking reply. I leave it up to you to take it back to the list. I tried to set a CC to the list, but got a response that the message had been put on hold, so I am trying it the other way round. -- Harald ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: hotplug and modalias handling (WAS: Re: RFD: Rework/extending functionality of mdev)
On Tue, Mar 17, 2015 at 05:51:39PM -0600, James Bowlin wrote: On Mon, Mar 16, 2015 at 09:55 AM, Natanael Copa said: On Fri, 13 Mar 2015 13:12:56 -0600 James Bowlin bit...@gmail.com wrote: TL;DR: the current busybox (or my code) seems to be broken on one or two systems out of thousands; crucial modules don't get loaded using hotplug + coldplug. Please give me a runtime option so I can give my users a boot-time option to try the new approach. Is this manifested as the root device never showing up? Do you use mdev as kernel helper? Yes. We specifically use it to load modules associated with newly discovered hardware when our live system first boots. This works just fine 99+% of the time. Alpine linux initramfs currently does: find /sys -name modalias | xargs sort -u | xargs modprobe -a using busybox modprobe. That expression doesn't look right, for several reasons. About six weeks ago I thought a suggestion from the following alpinelinux post would solve my problems: http://article.gmane.org/gmane.linux.distributions.alpine.devel/2791 Or it could be the kernel creating the uevent but not the modalias file--which is a known issue with some hardware; the workaround is to use something like this: find /sys -name uevent -exec grep -h MODALIAS= '{}' + | sort -u | cut -c 10- | xargs modprobe -a As the one who probably posted this, I can comment further: I've heard of one computer where this was an issue, a couple of years ago; a number of people on the Puppy Linux forums were experimenting with mdev, and one reported that the modalias file was missing with a Broadcom wireless card. But that wasn't the answer. The find -exec sed code I gave previously is a streamlined version of this. As I said, it made no difference. Also, I was curious whether you found that using /sys was ever better than using /sys/devices.
It is certainly slower, but I haven't seen any systems where scanning /sys gives any more information than scanning /sys/devices after passing the aliases through sort -u. It didn't make a difference on the two systems with problems. I'm not aware of any cases where /sys/devices was problematic, and that was another method that was recommended on the Puppy Linux forums. However, you may find it worth noting that find /sys will get its list of files slightly later than globbing will. I suspected one system had a problem loading the sd-mod module even though the alias for this module showed up in the output of: find /sys/devices -name modalias -exec cat '{}' + on their system. One strange thing that makes me wonder if busybox [snip] I'm not even sure sd-mod was the missing module. It may well have been the usb-storage module, which I know was missing (not loaded) on a 2nd system with problems. In the past few days I've got my hands on a system where, depending on the hardware (usb device used and usb slot used), the usb-storage module sometimes does not get loaded, which means the boot device does not show up. Do you wait for the boot device to show up? I know the kernel has a rootwait parameter for similar issues with usb drives. Of course, the root device may be unknown depending on what you're doing, which would make waiting properly rather difficult. (What I'd be inclined to do is check for the root and, if that fails, coldplug a second time / blindly load sd_mod and usb_storage.) My current plan is to make repeated coldplugging the default method for loading modules, since in every case I've been able to investigate, an alias for the missing module(s) is in the output of the find command I gave above. I haven't yet done exhaustive testing, but every test with repeated coldplugging has worked, even on the system where the hotplugging is flaky (and which is now out of my reach). HTH, Isaac Dunham
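The repeated-coldplug plan amounts to a bounded retry loop. Everything below is hypothetical stand-in code: coldplug() only comments where the modalias scan and modprobe would go, and the boot-device check is faked with a path that does not exist, so the sketch is safe to run anywhere.

```shell
# Bounded retry loop: coldplug, check for the boot device, repeat until
# found or the timeout expires. Stand-ins keep the sketch runnable.
coldplug() {
    # real version (roughly):
    #   find /sys/devices -name modalias -exec cat '{}' + 2>/dev/null \
    #       | sort -u | xargs modprobe -a -b -q 2>/dev/null
    :
}
boot_device_present() {
    [ -e /nonexistent-bootdev ]   # stand-in for probing/mounting candidates
}
timeout=3 found=no
while [ "$timeout" -gt 0 ]; do
    coldplug
    if boot_device_present; then found=yes; break; fi
    sleep 1
    timeout=$((timeout - 1))
done
echo "boot device found: $found"
```

The loop exits as soon as the device appears, so it adds no delay on the common path; only the failure case waits out the full timeout.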
Re: RFD: Rework/extending functionality of mdev
Le 16/03/2015 19:18, Harald Becker a écrit : On 16.03.2015 10:15, Didier Kryn wrote: 4) netlink reader the Unix way Why let our netlink reader bother about where it sends the event messages? Just let it do its netlink reception job and forward the messages to stdout.

netlink reader:
    set stdout to non-blocking I/O
    establish netlink socket
    wait for event messages
    gather event information
    write message to stdout

hotplug startup code:
    create a private pipe
    spawn netlink reader, redirect stdout to write end of pipe
    spawn fifosvd - xdev parser, redirect stdin from read end of pipe
    close both pipe ends (write end open in netlink, read end in fifosvd)

1) Why not let fifosvd act as the startup code? It is anyway the supervisor of the processes at both ends of the pipe, and in charge of re-spawning them in case they die. The netlink receiver should be restarted immediately so as not to miss events, while the event handler should be restarted on event (see comment below). This would make fifosvd specific to the netlink / hotplug function. My intention is to get a generally usable tool. I had not caught the point that you wanted it to be a general-purpose tool - sorry. You won't gain anything otherwise, as the startup of the daemon has to be done regardless. It does not matter whether you start fifosvd, which then forks again to bring itself into the background, and forks again to start the netlink part, or do it slightly differently: start an initial code snippet that does the pipe creation and the forks (starting the daemons in the background), then steps away. It is the same operation, only moved around a bit, but possibly without blocking other usages. Sure, it is the same. My point was about supervision. The netlink reader is a long-lived daemon. It shall not exit, and shall handle failures internally where possible; but if it fails, purely restarting it, without other intervening action to control / correct the reason for the failure, doesn't look like a good choice.
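Incidentally, the hotplug startup code in the pseudocode above is exactly what a shell does for a pipeline: create a private pipe, spawn both processes with the right ends on stdout/stdin, and close its own copies. A runnable miniature, with stand-ins for nldev and the fifosvd/parser side:

```shell
# 'nldev | fifosvd' in miniature: the shell plays the startup code here.
nldev()   { printf 'add /devices/demo\n'; }   # stand-in netlink reader
fifosvd() {                                   # stand-in fifosvd + handler
    while read -r ev; do echo "handler got: $ev"; done
}
nldev | fifosvd
```

This is why the private-pipe variant needs no named fifo at all: the pipe exists only between the two spawned processes.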
So it needs some higher instance to handle this, normally init or a different system supervisor program (e.g. an inittab respawn action). OK, then this higher instance cannot be an ordinary supervisor, because it must watch two intimately related processes and re-spawn both if one of them dies. Hence, it is yet another application. This is why I thought fifosvd was a good candidate to do that. Also because it already contains some supervision logic to manage the operation handler. So, if fifosvd is a generally usable tool, it must come with a companion generally usable tool, let's call it fifosvdsvd, designed to monitor pairs of pipe-connected daemons. Whereas the device operation handler (including the conf parser) is started on demand, when incoming events require it. The job of fifosvd is this on-demand pipe handling, including failure management. 2) fifosvd would never close any end of the pipe, because it could need them to re-spawn any of the other processes. Like this, there is no need for a named pipe as long as fifosvd lives. Did you look at my pseudo code? It does *not* use a named pipe (fifo) for netlink operation, but a normal private pipe (so pipesvd may fit its purpose better). Whereas the hotplug helper mechanism won't work this way, and requires a named pipe (a different setup, achieved by a slightly different startup). Yes, but it cannot work if the two long-lived daemons are supervised by an ordinary supervisor, because one end of the pipe is lost if one of the processes dies, and this kind of supervisor will restart only the one which died. And I have a suggestion for simplicity: let the timeout/no-timeout feature be a parameter only for the event handler; it does not need to change the behaviour of fifosvd. I think it is fine to restart the event handler on-event even when it dies unexpectedly. ??? At some point you considered that the operation handler might be either long-lived or dying on timeout. I suggest that the supervision logic be identical in the two cases.
Didier
Re: RFD: Rework/extending functionality of mdev
On Mon, 16 Mar 2015 18:45:26 +0100 Harald Becker ra...@gmx.de wrote: On 16.03.2015 09:19, Natanael Copa wrote: I am only aware of reading kernel events from netlink or the kernel hotplug helper. Whereas I'm trying to create a modular system, which allows *you* to set up netlink usage, *the next person* to set up hotplug helper usage (still with the speed improvement, not the old behaviour), and ... What is this new, third plug mechanism? I think that is the piece I am missing to understand why the fifo manager approach would be superior. ... the *ability* to set up a system with a different plug mechanism, not yet mentioned, using the same modular system. The system maintainer decides, just by putting together the functional blocks. Does this not-yet-mentioned plug mechanism exist? Think, for simplicity, about doing the event gathering from the sys file system with some shell script code, then forwarding the device event message to the rest of the system. Looks ugly? Looks ugly, yes. What about older or small systems without the hotplug feature? Do systems exist that are so old that they lack hotplug - but at the same time are new enough to have sysfs? I suppose it would make sense for kernels without CONFIG_HOTPLUG, but I would expect such systems to use highly customized/specialized tools rather than general-purpose tools. My intention is *not* to *solve your needs*; it is to give *you* the *tools to build* the system with *your intended functionality*, by putting together some commands or command parameters, without writing code (programs). At the only expense of some (possibly) dead code in the binary. Where dead code means dead for you, but used by others who want to set up their system in a different way (build your own BB version and opt out, if you dislike it). We have different goals, so I will likely not use your tools. I want a tool for hotplug that avoids dead code. Thanks for your patience and thanks for describing it in few words. I think I finally got it.
-nc
Re: RFD: Rework/extending functionality of mdev
On 16.03.2015 09:19, Natanael Copa wrote: I am only aware of reading kernel events from netlink or the kernel hotplug helper. Whereas I'm trying to create a modular system, which allows *you* to set up netlink usage, *the next person* to set up hotplug helper usage (still with the speed improvement, not the old behaviour), and ... What is this new, third plug mechanism? I think that is the piece I am missing to understand why the fifo manager approach would be superior. ... the *ability* to set up a system with a different plug mechanism, not yet mentioned, using the same modular system. The system maintainer decides, just by putting together the functional blocks. Think, for simplicity, about doing the event gathering from the sys file system with some shell script code, then forwarding the device event message to the rest of the system. Looks ugly? What about older or small systems without the hotplug feature? My intention is *not* to *solve your needs*; it is to give *you* the *tools to build* the system with *your intended functionality*, by putting together some commands or command parameters, without writing code (programs). At the only expense of some (possibly) dead code in the binary. Where dead code means dead for you, but used by others who want to set up their system in a different way (build your own BB version and opt out, if you dislike it).
Re: RFD: Rework/extending functionality of mdev
On 16.03.2015 21:30, Natanael Copa wrote: Do systems exist that are so old that they lack hotplug - but at the same time are new enough to have sysfs? Oh, yes! Mainly in the embedded world. I suppose it would make sense for kernels without CONFIG_HOTPLUG, but I would expect such systems to use highly customized/specialized tools rather than general-purpose tools. You fail to consider all those embedded devices which build their minimal system around BB, plus some application-specific programs. Such systems use only BB tools for the system setup, and what if they want to use a specialized plug event gatherer (however that might work)? We have different goals, so I will likely not use your tools. I want a tool for hotplug that avoids dead code. Then build your own BB version and disable, in the config, those mechanisms you do not use. Then there would be no dead code ... but you are free to use whichever tool you like. Thanks for your patience and thanks for describing it in few words. I think I finally got it. Your decision, but I still think you aren't doing anything different, except moving some of the code around without gaining any real benefit, at the expense of blocking those who would like to set up their system in a different way ... ough ... I wouldn't like to use your tools either! [For completeness: if you'd like to know why I think your design is not the right way, see what Laurent told you about this.] -- Harald
Re: RFD: Rework/extending functionality of mdev
On 16.03.2015 22:25, Didier Kryn wrote: I had not caught the point that you wanted it to be a general-purpose tool - sorry. It's a lengthy and complex discussion; it is not very difficult to miss something, so no trouble. Please ask when there are questions. The netlink reader is a long-lived daemon. It shall not exit, and shall handle failures internally where possible; but if it fails, purely restarting it, without other intervening action to control / correct the reason for the failure, doesn't look like a good choice. So it needs some higher instance to handle this, normally init or a different system supervisor program (e.g. an inittab respawn action). OK, then this higher instance cannot be an ordinary supervisor, because it must watch two intimately related processes and re-spawn both if one of them dies. Hence, it is yet another application. This is why I thought fifosvd was a good candidate to do that. Also because it already contains some supervision logic to manage the operation handler. Supervision is a system-dependent function, which differs with the philosophy by which the init process works and handles such things. So before we talk about which supervision, we need to say which type of supervision you are using, that is, mainly, which init system you use. So, if fifosvd is a generally usable tool, it must come with a companion generally usable tool, let's call it fifosvdsvd, designed to monitor pairs of pipe-connected daemons. A pipe is a unidirectional thing. Writing a program that sits at the read end of a pipe to watch the other side is a logical mixing of functions, but ... Whereas the device operation handler (including the conf parser) is started on demand, when incoming events require it. The job of fifosvd is this on-demand pipe handling, including failure management. 2) fifosvd would never close any end of the pipe, because it could need them to re-spawn any of the other processes. Like this, there is no need for a named pipe as long as fifosvd lives. Did you look at my pseudo code?
It does *not* use a named pipe (fifo) for netlink operation, but a normal private pipe (so pipesvd may fit its purpose better). Whereas the hotplug helper mechanism won't work this way, and requires a named pipe (a different setup, achieved by a slightly different startup). Yes, but it cannot work if the two long-lived daemons are supervised by an ordinary supervisor, because one end of the pipe is lost if one of the processes dies, and this kind of supervisor will restart only the one which died. ... you are wrong. When the netlink process dies, the write end of the pipe is automatically closed by the kernel. This first lets the handler process detect end-of-file when waiting for more messages, so that process exits. fifosvd then checks the pipe and gets an error telling it that the pipe has shut down on the write end, so fifosvd does the expected thing: it exits too. Even if that exit is delayed somewhat, it does not matter, since the higher instance respawns the hotplug system due to the netlink exit. The new pipe is established in parallel, while the old pipe (including its processes) vanishes after a small amount of time. That is, your supervision chain is slightly different: - netlink is supervised by the higher instance, but itself watches for failures on the pipe (in case the read end dies unexpectedly) - the supervision of the pipe's read side is a bit more complex, as we use an on-demand handler process, so there are two cases, depending on whether a handler process is currently active: - when no handler process is active, fifosvd detects a pipe failure of the write end immediately and just exits. So there is no need for supervision; only some resources have to be freed - when there is an active handler process, that process is supervised by fifosvd, but itself checks for EOF on the pipe, and exits. Meanwhile fifosvd waits for the exit of the handler process and checks the exit status (success or any failure).
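The shutdown cascade described above - the writer dies, the read side drains the pipe, sees EOF, and exits on its own - can be shown in miniature with stand-ins (not the real tools):

```shell
# The writer (standing in for nldev) exits after two events; the read
# side still processes both, then hits EOF and terminates cleanly.
printf 'add sda\nadd sdb\n' | while read -r ev; do
    echo "handled: $ev"
done
echo "read side exited after EOF"
```

No event written before the writer's death is lost; the read side only stops once the pipe is drained.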
Whichever way fifosvd takes, in the end it detects that the write end of the pipe has gone, and takes its leave. So the supervision chain is: init - netlink - fifosvd - handler At some point you considered that the operation handler might be either long-lived or dying on timeout. I suggest that the supervision logic be identical in the two cases. That was an alternative in the discussion, to show how I got to my solution, which picked up a suggestion Laurent made. So the alternatives show the steps my approach went through while being improved. I highly prefer this last one (netlink reader the Unix way). It is the version with the most flexibility, and it even addresses the wish to use a private pipe rather than a named pipe for netlink operation, without adding extra overhead for that possibility. Indeed the alternatives are very similar; they perform the same principal operation, but move some code around a bit, for different purposes, to see which impact each
Re: RFD: Rework/extending functionality of mdev
On Sun, 15 Mar 2015 01:45:05 +0100 Harald Becker ra...@gmx.de wrote: On 14.03.2015 03:40, Laurent Bercot wrote: Please take a look at my s6-uevent-listener/s6-uevent-spawner, or at Natanael's nldev/nldev-handler. The long-lived uevent handler idea is a *solved problem*. I know how that works, and this is the problem. I see limitations of this approach, which I try to overcome. 1) using the netlink mechanism only - no problem 2) using the kernel hotplug helper mechanism - fails to work, or still suffers from re-parsing the conf for each event. 3) Open up my mind and accept that the next one coming around may have a brand-new plug mechanism in his bag - may be difficult to do without changing code. I am only aware of reading kernel events from netlink or the kernel hotplug helper. What is this new, third plug mechanism? I think that is the piece I am missing to understand why the fifo manager approach would be superior. (feel free to point me to a link in the mailing list archive in case you already wrote about it and it drowned in the amount of words) -nc
hotplug and modalias handling (WAS: Re: RFD: Rework/extending functionality of mdev)
On Fri, 13 Mar 2015 13:12:56 -0600 James Bowlin bit...@gmail.com wrote: TL;DR: the current busybox (or my code) seems to be broken on one or two systems out of thousands; crucial modules don't get loaded using hotplug + coldplug. Please give me a runtime option so I can give my users a boot-time option to try the new approach. Do you use mdev as kernel helper? I have a few users (out of many thousands) who have trouble loading the modules they need using the conventional busybox tools. I enable hotplugging and then coldplug, loading everything from: find /sys/devices -name modalias -exec cat '{}' + 2>/dev/null I've also used: find /sys/devices -name uevent -exec sed -n 's/^MODALIAS=//p' '{}' + 2>/dev/null But after filtering through sort -u, the outputs of the two find commands are identical. So I have a failsafe fallback that loads all modules from: /lib/modules/$(uname -r) Alpine linux initramfs currently does: find /sys -name modalias | xargs sort -u | xargs modprobe -a using busybox modprobe. Since loading the modules might trigger new MODALIAS events, you need your hotplugger to handle those. You can use this line in mdev.conf: $MODALIAS=.* root:root 0660 @modprobe -b $MODALIAS However, what I want to do is fix the hotplug handler to be so fast that I can scan /sys for uevent files and trigger the hotplug events. This means that the hotplug event handler needs to be able to load kernel modules without (too much) forking of modprobe. I am considering libkmod as an alternative, but I would prefer that it could be solved with busybox only; however, looking at the busybox modalias code, it looks like it will require intrusive changes to make it handle modaliases from a stream. -nc
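The "scan /sys for uevent and trigger the hotplug event" step mentioned above amounts to re-firing each event by writing "add" to the uevent files. The sketch below runs against a throwaway fake sysfs tree so it is safe to execute; on a real system the loop would walk /sys itself, and each write would re-invoke the registered hotplug helper.

```shell
# Coldplug by re-triggering uevents. A fake tree stands in for /sys so
# the sketch has no side effects on the running system.
fake=$(mktemp -d)
mkdir -p "$fake/devices/fakedev"
: > "$fake/devices/fakedev/uevent"
find "$fake" -name uevent | while read -r f; do
    echo add > "$f"    # on real sysfs this re-fires the event to the helper
done
cat "$fake/devices/fakedev/uevent"
rm -rf "$fake"
```

This is why handler speed matters: one helper invocation per uevent file adds up quickly on a full /sys scan.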
Re: RFD: Rework/extending functionality of mdev
Le 16/03/2015 00:58, Harald Becker a écrit : We were looking at alternative solutions, so even one more: 3) netlink reader the Unix way Why let our netlink reader bother about where it sends the event messages? Just let it do its netlink reception job and forward the messages to stdout.

netlink reader:
    set stdout to non-blocking I/O
    establish netlink socket
    wait for event messages
    gather event information
    write message to stdout

hotplug startup code:
    create a private pipe
    spawn netlink reader, redirect stdout to write end of pipe
    spawn fifosvd - xdev parser, redirect stdin from read end of pipe
    close both pipe ends (write end open in netlink, read end in fifosvd)

The general scheme makes sense to me, but I would change two details: 1) Why not let fifosvd act as the startup code? It is anyway the supervisor of the processes at both ends of the pipe, and in charge of re-spawning them in case they die. The netlink receiver should be restarted immediately so as not to miss events, while the event handler should be restarted on event (see comment below). 2) fifosvd would never close any end of the pipe, because it could need them to re-spawn any of the other processes. Like this, there is no need for a named pipe as long as fifosvd lives. And I have a suggestion for simplicity: let the timeout/no-timeout feature be a parameter only for the event handler; it does not need to change the behaviour of fifosvd. I think it is fine to restart the event handler on-event even when it dies unexpectedly. Didier This way we can let the starting process decide which type of pipe we use: a private pipe for netlink, and a named pipe for the hotplug helper. I think this is not far away from Laurent's (or Natanael's) solution, at the only cost of a small long-lived helper process, managing the on-demand handler startup and checking for failures. A small general-purpose daemon in the sense of supervisor daemons (e.g. tcpsvd), with a generally reusable function for other purposes. ... better?
OK, but this brings me to the message format in the pipe. I strongly think we should use a textual format, but do the required checks for control characters and do some (shell-compatible) quoting. This would allow us to do: netlink reader >/dev/ttyX (to display all device plug events on a console) netlink reader >>/tmp/uevent.log (append all event messages to a log file) ... and all such things. I know the parser needs to do some checking and unquoting, but we have a single reader, and it doesn't matter how much data it reads from the pipe in a single chunk, as long as the writers ensure they write each message with a single write (atomicity). The parser assumes it is reading text from stdin, with the required checking and unquoting. This way we get maximum compatibility and may easily replace every part with some kind of script. -- Harald
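The shell-compatible quoting suggested above might look like this; the format is illustrative, not a specification. Each event is one line of KEY='value' pairs, so a shell reader can unquote it with eval:

```shell
# One event per line, shell-quoted; a reader unquotes it with eval.
# emit and the field names are hypothetical, for illustration only.
emit() { printf "ACTION='%s' DEVPATH='%s'\n" "$1" "$2"; }
emit add /devices/pci0000:00/usb1 | while read -r line; do
    eval "$line"                 # sets ACTION and DEVPATH
    echo "event: $ACTION $DEVPATH"
done
```

A real writer would also have to escape single quotes and reject control characters before emitting, as the text notes.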
Re: RFD: Rework/extending functionality of mdev
On Sun, Mar 15, 2015 at 08:06 PM, Laurent Bercot said: the kernel guarantees not only atomicity for write operations, but also for read operations (not in POSIX, AFAIK). Second sentence of http://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html : ^ For goodness sake. This appears to be an argument merely for the sake of having an argument. If, say, Linux has documentation somewhere explaining what it does with multiple readers on a pipe, [...] A simple Internet search brings up the Linux fifo man page: http://man7.org/linux/man-pages/man7/fifo.7.html A FIFO special file (a named pipe) is similar to a pipe, except that it is accessed as part of the filesystem. It can be opened by multiple processes for reading or writing. [...] and committing to NOT changing that behaviour EVER, IMO this is a ridiculous demand that can't ever be met by any software. For my users the current busybox mdev hotplugging is not 100% reliable (more like 99+% reliable), which is a big pain. I'd love to see other busybox hotplug solutions that are selectable at runtime. You both have a lot of value to contribute. If both/all solutions can be made (via compile-time options) available at runtime, and one is clearly superior to the others, then that will quickly be figured out. It is very kind to try to keep someone else from wasting time on a technically inferior solution, but at some point it is better for everyone to just let them go ahead and use their time as they see fit. Peace, James
Re: RFD: Rework/extending functionality of mdev
On 14.03.2015 03:40, Laurent Bercot wrote: What would you do if your kid wanted to drive a car but said he didn't like steering wheels, would you build him a car with a joystick ? ... [base car with wheel steering module replaceable by a joystick module] ... ... and the next one coming around, with an automatic steering module, may replace the wheel steering or joystick module, plug in his module, and also take advantage of your base car ... -- Harald
Re: RFD: Rework/extending functionality of mdev
On 14.03.2015 03:40, Laurent Bercot wrote: - for reading: having several readers on the same pipe is the land of undefined behaviour. You definitely don't want that. ... just out of curiosity: On most systems it is perfectly possible to have multiple readers on a pipe, as long as all readers and writers are polite enough to use the same message size (= PIPE_BUF). On most (but not all) Unix systems the kernel guarantees not only atomicity for write operations, but also for read operations (not in POSIX, AFAIK). With multiple readers you get load balancing. The first available reader gets the next message from the pipe and has to handle that message. You can't predict which process will receive a specific message, so every process has to handle all types of incoming messages. This is usually the case when all readers run the same program. With a small helper you can even fire up a new reader process when there is more data in the pipe, then sleep some time to let the new process pick up a message from the pipe, and then check the pipe again for more data, firing up the next reader (up to an upper limit of reader processes). Reader processes just die when no more data is available in the pipe. This applies to pipes in general, whether private or named (fifo). -- Harald
Re: RFD: Rework/extending functionality of mdev
On 15/03/2015 19:39, Harald Becker wrote: On most systems it is perfectly possible to have multiple readers on a pipe, when all readers and writers be so polite to use the same message size (= PIPE_BUF). On most (but not all Unix systems) the kernel guaranties not only atomicity for write operations, but also for read operations (not in POSIX, AFAIK). Second sentence of http://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html : The behavior of multiple concurrent reads on the same pipe, FIFO, or terminal device is unspecified. What most systems do in practice is irrelevant. There is no guarantee at all on what a system will do when you have multiple readers on the same pipe; so it's a bad idea, and I don't see that there's any room for discussion here. If, say, Linux has documentation somewhere explaining what it does with multiple readers on a pipe, and committing to NOT changing that behaviour EVER, then it might be reasonable for Linux-specific software to rely on it. I'm not aware of such a piece of documentation though. -- Laurent ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
On 15.03.2015 20:06, Laurent Bercot wrote: What most systems do in practice is irrelevant. There is no guarantee at all on what a system will do when you have multiple readers on the same pipe; so it's a bad idea, and I don't see that there's any room for discussion here. My lead-in was "just for curiosity", and that's it: it works on many systems. But I never proposed doing something like that. It's what my lead-in says: curiosity. -- Harald
Re: RFD: Rework/extending functionality of mdev
On 15/03/2015 20:41, James Bowlin wrote: http://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html : ^ For goodness sake. This appears to be an argument merely for the sake of having an argument. This is POSIX.1-2008, the very specification that Linux, and other operating systems, are supposed to implement. It is the authoritative reference to follow whenever you're designing Unix software. I don't understand what your objection is. A FIFO special file (a named pipe) is similar to a pipe, except that it is accessed as part of the filesystem. It can be opened by multiple processes for reading or writing. Yes, I know the Linux man pages, and of course multiple processes are allowed to open a pipe for reading. But there is nothing in that page that documents what Linux does when multiple processes actually attempt to read on the same pipe. [] and committing to NOT changing that behaviour EVER, IMO this is a ridiculous demand that can't ever be met by any software. This is called specification and normalization, i.e. what standards are for. Sure, standards change and evolve, and that's a good thing; my point is when something is explicitly non-standardized, it is not a good idea to do that thing and expect a fixed behaviour. There really is no room for disagreement here. For my users the current busybox mdev hotplugging is not 100% reliable (more like 99+% reliable) which is a big pain. I'd love to see other busybox hotplug solutions that are selectable at runtime. So would I, and the solution you're looking for is called netlink + mdev -i, which is all that remains to be implemented. I would really like to cut on the bikeshedding and see some real work done now; if nothing has appeared when I get some time, I'll do it myself - which so far seems the only way to get things done, and which will help you much more than buggy solutions in search of a problem. 
It is very kind to try to help someone else from wasting time on a technically inferior solution but at some point it is better for everyone to just let them go ahead and use their time as they see fit. Multiple readers on a pipe is not technically inferior, it is technically *invalid*. I'm not preventing anyone from coding anything, but I will fight inclusion of buggy code into busybox, which is a major disservice to do to you and your users. -- Laurent ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
On 15.03.2015 20:06, Laurent Bercot wrote: The behavior of multiple concurrent reads on the same pipe, FIFO, or terminal device is unspecified. That is, you can't predict which process will get the data, but each single read operation on a pipe (private or named) is atomic. Either it completes and reads the requested number of bytes, or it does not happen at all. It won't read half of the data, let some data pass to a different process, then continue with the read in the first process (or do a short read when there is enough data). I don't want to introduce this or use it, but please stop and think about it: when every writer and every reader agrees on the size of messages written to / read from the pipe, you can have multiple writers *and* multiple readers on the same pipe, due to the atomicity of the write and read operations. ... and I know, it's not in POSIX / OpenGroup ... it's just working practice ... try it and you will see, it works. ... for curiosity :) (And don't fear, I won't do this)
Re: RFD: Rework/extending functionality of mdev
On 15/03/2015 23:44, Harald Becker wrote: There are too many stupid programmers out there who would try to add something like that into system management related programs. It couldn't get worse. Even if it works at first glance, it is error prone, and the next person who changes the message size of one process will break the chain, and possibly smash the system (as it runs with root privileges). Laurent just overlooked my lead-in: just for curiosity. And (again) that's it: curiosity. ... at least until the behaviour has been standardized, clearly documented (not the same as the former), and guaranteed. Yes. Thank you for acknowledging it and for alleviating my fear. -- Laurent
Re: RFD: Rework/extending functionality of mdev
On 15/03/2015 21:54, Harald Becker wrote: My lead in was: just for curiosity, and that's it, it works on many systems. ... but I never proposed, doing something like that. It's what my lead in says: curiosity. Okay, fair enough. Please let's not do it then. :) -- Laurent ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
Hi James! 1) Why argue over something that has already been admitted? It does not bolster your argument and it does not put you in a good light. Keep cool, I know Laurent's fears. There are too many stupid programmers out there who would try to add something like that into system management related programs. It couldn't get worse. Even if it works at first glance, it is error prone, and the next person who changes the message size of one process will break the chain, and possibly smash the system (as it runs with root privileges). Laurent just overlooked my lead-in: just for curiosity. And (again) that's it: curiosity. ... at least until the behaviour has been standardized, clearly documented (not the same as the former), and guaranteed. -- Harald
Re: RFD: Rework/extending functionality of mdev
So, as I'm not in write-only mode, here are some possible alternatives we could choose (maybe it shows better how and why I arrived at my approach):

1) netlink with a private pipe to a long lived handler:

   establish netlink socket
   spawn a pipe to handler process
   write a "netlink, no timeout" message to pipe
   wait for event messages
   gather event information
   write message to pipe

The initial pipe message lets the parser / handler know that we are in netlink operation and disables the timeout functionality, resulting in both processes being long lived. This won't harm the system much, as memory of sleeping processes is usually swapped out, but still resources lie around unused. @Laurent: You know the race conditions why the handler process needs to be long lived here, or why we otherwise need complex pipe management with re-spawning handlers, and all that stuff. You told us about them. This would indeed be the simplest solution when splitting netlink reader and handler. Other mechanisms may still create a named pipe and use the same handler for their purpose. With the caveat of two long lived processes, of which I call one big. So, look forward to the second alternative ...

2) netlink with a private pipe but on-demand start of the handler (avoiding the race):

   create a pipe and hold both ends open (but never read)
   establish netlink socket
   wait for event message
   gather event information
   if no handler process running
     spawn a new handler process, redirecting stdin from read end of pipe
   write message to pipe

   with a SIGCHLD handling of:

   get status of process
   do failure management
   check for data still pending in pipe
   re-spawn a handler process, redirecting stdin from read end of pipe

The netlink reader is a long lived process; the handler is started on demand when required and may die after some timeout. Races won't happen this way, as the pipe does not vanish, and data written into the pipe during exit of an old handler does not get lost (the next handler will get the message). ... better?
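The key property of alternative 2 above can be sketched in a few lines of Python (illustrative only; the real implementation would be C inside busybox). The listener holds both ends of an anonymous pipe for its whole lifetime, so an event written while no handler is running simply queues up and is consumed by the handler spawned afterwards:

```python
# Sketch of alternative 2: the (stand-in) netlink reader holds BOTH ends
# of an anonymous pipe open, so messages written while no handler is
# running cannot be lost; the next on-demand handler picks them up.
# The "handler" here is just a forked child reading one line.
import os

r, w = os.pipe()  # listener keeps both fds open for its whole lifetime

def spawn_handler():
    pid = os.fork()
    if pid == 0:                      # child: the on-demand handler
        os.close(w)
        with os.fdopen(r, "rb") as f:
            line = f.readline()       # consume one queued event
        os._exit(0 if line else 1)
    return pid

# Event arrives while NO handler is running: the write is not lost,
# because the listener's own read end keeps the pipe alive.
os.write(w, b"add /block/sda\n")

pid = spawn_handler()                 # started on demand, drains the event
_, status = os.waitpid(pid, 0)
print("handler exit status:", os.waitstatus_to_exitcode(status))
```

The exit status is 0 because the handler found the queued event waiting in the pipe, exactly the race-free behaviour the step list describes.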
This is what I want to do, with an additional choice for more clarity: let the netlink reader do its job, and split off the pipe management and handler start into a separate thread, but otherwise exactly the same operation. With *no* extra cost, the pipe management and the handler startup may then be used for other mechanism(s). ... still afraid of using a named pipe? You would still prefer a private pipe for netlink? ... ok, look at the next alternative (and this one I came up with taking your fears into account).

3) netlink spawning an external supervisor for on-demand handler startup

netlink reader:

   establish netlink socket
   create a pipe, save write end for writing to pipe
   spawn "fifosvd - xdev parser", redirecting stdin from read end of pipe
   close read end of pipe
   wait for event messages
   gather event information
   write message to pipe

fifosvd:

   save and hold read end of pipe open (but never read)
   wait until data arrives in pipe (poll for read)
   spawn handler process, handing over the pipe read end to stdin
   wait for exit of process
   failure management

A novice may think this way we added another process to the data flow, but no, the data flow is still the same: netlink - pipe - handler. The extra process is a small helper containing the code for the on-demand start of the handler and the failure management, but it never gets in contact with the data passed through the pipe. This approach allows simple reuse of code for other mechanism(s), and fifosvd may be of general use: when its argument is a single dash (-), it uses the pipe from stdin, else it creates and opens a named pipe. It may also be used for on-demand start of other jobs:

   process producing occasional data | script to process the data

may be changed to:

   process producing data | fifosvd - script to process data

which starts script on demand when data arrives in the pipe, and when script dies, restarts it as soon as more data is in the pipe. This is extra benefit from my approach, with no extra cost.
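The core fifosvd trick — hold the fifo open so it never sees EOF, poll until data is pending, and only then start a handler on the fifo's read end — can be sketched in Python (fifosvd itself is a proposal, not an existing tool; all names here are illustrative):

```python
# Sketch of the proposed fifosvd idea: keep the fifo from ever reporting
# EOF by holding an O_RDWR fd, poll until data is pending, then spawn a
# handler with the fifo as its stdin. "head -n 1" stands in for the
# real event parser.
import os
import select
import subprocess
import tempfile

fifo = os.path.join(tempfile.mkdtemp(), "xdev.fifo")
os.mkfifo(fifo, 0o600)

# O_RDWR keeps a reader AND a writer attached from one fd: event sources
# may come and go without the fifo seeing EOF, and queued data survives
# between handler runs.
fd = os.open(fifo, os.O_RDWR)

# Simulate an event source writing while no handler exists.
os.write(fd, b"add sda\n")

poller = select.poll()
poller.register(fd, select.POLLIN)
if poller.poll(1000):                  # data pending: now spawn a handler
    handler = subprocess.run(
        ["head", "-n", "1"], stdin=fd, capture_output=True)
print(handler.stdout)
```

The supervisor never reads the event data itself; it only watches for readability and hands the descriptor to the handler, matching the "never get in contact with the data" claim above.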
I hope this helps allay some fears. -- Harald
Re: RFD: Rework/extending functionality of mdev
On Sun, Mar 15, 2015 at 10:10 PM, Laurent Bercot said: This is POSIX.1-2008, the very specification that Linux, and other operating systems, are supposed to implement. It is the authoritative reference to follow whenever you're designing Unix software. I don't understand what your objection is. I tried to make my complaint clear by highlighting two portions of text, one by Harald and one by you. I apologize that my post was unclear to you. You omitted one part of text I had highlighted so I repeat it here: On Sun, Mar 15, 2015 at 01:41 PM, James Bowlin said: kernel guarantees not only atomicity for write operations, but also for read operations (not in POSIX, AFAIK). The post you were replying to already admitted that multiple fifo readers was not POSIX compliant. 1) Why argue over something that has already been admitted? It does not bolster your argument and it does not put you in a good light. 2) While violating POSIX is usually not a good idea, it is well known that POSIX is woefully incomplete and there are long-standing extensions to POSIX on all real-world systems, including busybox. I propose an extension to Godwin's law: if someone objects to something because it relies on a long-standing extension to a POSIX standard then that person automatically loses the argument. I think your suggestions have been very valuable and your proposals may well represent a superior solution technically. I am a little frustrated by the silly arguments because I am so looking forward to seeing the fruition of your ideas. Peace, James
Re: RFD: Rework/extending functionality of mdev
On 14.03.2015 03:40, Laurent Bercot wrote: Hm, after checking, you're right: the guarantee of atomicity applies with nonblocking IO too, i.e. there are no short writes. Which is a good thing, as long as you know that no message will exceed PIPE_BUF - and that is, for now, the case with uevents, but I still don't like to rely on it. Named pipes are a proven IPC concept, not only in the Unix world. They are pipes and behave exactly like them, including non blocking I/O, programming the poll loop, and failure handling. There is only one difference: the method of getting access to the pipe file descriptors (either calling pipe() or open()). I call pipe an anonymous pipe. And an anonymous pipe, created by the netlink listener when it forks the event handler, is clearly the right solution, because it is private to those two processes. With a fifo, aka named pipe, any process with the appropriate file system access may connect to the pipe, and that is a problem: Right, any process with root access may write to this pipe, but don't you think such processes have the ability to do other nasty things, like changing the device node entries in the device file system directly? May processes with root access produce confusion on the pipe? Yes, but aren't such processes able to produce any kind of confusion they like? We could have (at some slight extra cost): - create the fifo with devparser:pipegroup 0620 - run hotplug helper (if used) suid:sgid to hotplug:pipegroup (or drop privileges to that) - drop netlink reader after socket creation to same user:group - run the fifo supervisor as devparser:parsergroup - but then we need to run the parser as suid root It needs to access the device file system and do some operations which require root (as far as I remember). Any suggestion how to avoid that suid root? - for writing: why would you need another process to write into the pipe ?
You have *one* authoritative source of information, which is the netlink listener; any other source is noise, or worse, malevolent. You are stuck on netlink usage and overlook that you are forcing others to do it your way. No doubt about the reasons for using netlink, but why force it on those who dislike it? Wouldn't this forcing be no different from forcing others to rely on e.g. systemd? (provocation, not expected to be answered) Whereas I'm trying to give the user (or say, system maintainer) the ability to choose the mechanism he likes, and even with the chance to flip the mechanism by just modifying one or two parameters or commands. Flipping the mechanism is even possible in a running system, without disturbance and without changing configuration. So why is this approach worse than forcing others to do things in a specific way? Apart from those known arguments why netlink is the better solution, where we absolutely agree. - for reading: having several readers on the same pipe is the land of undefined behaviour. You definitely don't want that. Is anyone here trying to have more than one reader on the pipe? The only reader of the pipe is the parser, and precisely because we are using fifos, the parser shouldn't bet on incoming message format and content. It shall do sanity checks on those before usage (and here we hit the point where I expect some overhead, though not much, due to other changes). Isn't it good practice to do this for other pipes too (even if a bit more out of paranoia)? But all with the benefit of avoiding re-parsing the conf for every incoming event, and an expected overall speed improvement. Not to mention the possibility to choose/flip the mechanism as the user likes. This even includes extra possibilities for e.g. debugging and watching purposes. With a simple redirection of the pipe you may add event logging functionality and/or a live display of all event messages (possibly filtered by a formatting script / program).
All without extra cost / impact for normal usage, and without creating special debug versions of the event handler system. I'm just trying to make it modular, not monolithic. - generally speaking, fifos are tricky beasts, with weird and unclear semantics around what happens when the last writer closes, different behaviours wrt kernel and even *kernel version*, and more than their fair share of bugs. Programming a poll() loop around a fifo is a lot more complicated, counter-intuitive, and brittle, than it should be (whereas anonymous pipes are easy as pie. Mmm... pie.) See my statement about fifos above. I don't know what you fear about fifos, but their usage and functionality are more proven in the Unix world than you expect. Sure, you need to watch your step, but this should also be done when using pipes (even if only out of paranoia, e.g. checking incoming data before usage instead of blind reliance). And maybe there are internal differences in pipe / fifo handling between kernels, but likely they are internal and don't change the expected usage
RE: RFD: Rework/extending functionality of mdev
Stream-writes [pipe] are not atomic, and your message can theoretically get cut in half and interleaved with another process writing the same fifo. Any pipe, whether named or not, IS atomic so long as the datagrams in question are smaller than PIPE_BUF in size. This has been true since Day 1, in every Unix worthy of the name. You have to be careful on the reads, though: you need to embed the size of the datagram into itself so that you can be sure you don't get them packed together. If the datagrams are of fixed size, then you don't even need this. Most of the pipes I use this way have a datagram whose first field is the size. Atomic write(2) of each datagram into the pipe. The reader does a read(2) of the size field, followed by a read sized to get the rest of the datagram. No muss, no fuss, and pretty darned fast, too. And it works _everywhere_, on every Unix/Linux/whateverix version known to Man. -- Jim ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
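Jim's size-prefixed datagram scheme might look like this in Python (an illustrative sketch; busybox itself would do the equivalent in C). Each datagram is written with a single write(), which is atomic as long as the whole frame stays under PIPE_BUF; the reader first reads the fixed-size length field, then loops for the remaining bytes:

```python
# Sketch of a size-prefixed datagram protocol over a pipe: one atomic
# write() per datagram, two-phase read on the other side.
import os
import struct

r, w = os.pipe()

def send(fd, payload):
    frame = struct.pack("!I", len(payload)) + payload
    assert len(frame) <= 4096            # stay under a conservative PIPE_BUF
    os.write(fd, frame)                  # one atomic write per datagram

def recv(fd):
    size, = struct.unpack("!I", os.read(fd, 4))
    data = b""
    while len(data) < size:              # a read may be short; loop
        data += os.read(fd, size - len(data))
    return data

send(w, b"ACTION=add DEVPATH=/block/sda")
send(w, b"ACTION=remove DEVPATH=/block/sda")
m1 = recv(r)
m2 = recv(r)
print(m1, m2)
```

Because the size travels inside the frame, datagrams packed back-to-back in the pipe buffer are split apart correctly, which is the "no muss, no fuss" property described above.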
Re: RFD: Rework/extending functionality of mdev
TL;DR: the current busybox (or my code) seems to be broken on one or two systems out of thousands; crucial modules don't get loaded using hotplug + coldplug. Please give me a runtime option so I can give my users a boot-time option to try the new approach. On Fri, Mar 13, 2015 at 02:04 PM, Guillermo Rodriguez Garcia said: Perhaps my message is not getting across. What I am saying is that I am not sure your suggestion is actually something others would find useful. Other than the fact that you like it yourself I don't see a lot of enthusiasm around about it. I've been following this thread on and off and I feel *very* enthusiastic about Harald Becker's approach. I kept silent because there was already so much posting and I figured everyone would finally get on board with Harald's suggestions. I admit I build my own busybox but I would *hate* to have to make this choice at build-time. I want to give my users a boot-time option (failsafe or something like that) to go back to the older, slower way. Or maybe make the older way the default and give them an option to use the new approach. Making it a build-time option is extremely hubristic; it assumes both the current code and the new code are perfect. I have a few users (out of many thousands) who have trouble loading modules they need using the conventional busybox tools. I enable hotplugging and then coldplug, loading everything from: find /sys/devices -name modalias -exec cat '{}' + 2>/dev/null I've also used: find /sys/devices -name uevent -exec sed -n 's/^MODALIAS=//p' '{}' + 2>/dev/null But after filtering through sort -u the outputs of the find commands are identical. So I have a failsafe fallback that loads all modules from: /lib/modules/$(uname -r) The alias for the needed module shows up in those lists but it doesn't always get loaded. Oddly enough the crucial module does *not* show up when I pipe that output through: | xargs modprobe -a -D But it shows up when I use the Debian modprobe.
The crucial module does get loaded on my system when I use: | xargs modprobe -a This is a strange bug in modprobe -a -D but it does not explain the bug that causes failure to load certain modules via hotplug + coldplug. Maybe the bug is in my code; maybe the bug is in busybox. I don't know for sure. My guess is it is a glitch in the hotplugging. I have not heard back from the user who reported the problem. The next test would be to keep doing the cold-plugging and see if that fixes the problem. Last year I discovered and reported a bug in the small modprobe when it did not load certain modules during a coldplug. The bug was fixed but I have since moved to the large modprobe. I'm hoping that a more streamlined and orderly hotplug solution in busybox will fix the problem above so I don't have to ever resort to loading all modules. If Harald's solution is a runtime option then I could make it a boot-time option and test it on thousands of machines. ISTM this is the only sane approach. I also liked this suggestion from Natanael: On Fri, Mar 13, 2015 at 10:33 AM, Natanael Copa said: While solving this, I would also like to find a way to load the MODALIAS without forking. But I don't think this is the answer: One option is to add modprobe -i which reads modaliases from stdin with a timeout. Maybe that is a lower hanging fruit than mdev -i? Since ISTM that can be handled with xargs (does that fork a lot?). ISTM there is a lot of forking in the find commands as illustrated above. I think we would need to use: modprobe --find-modalias /sys/devices where the user gets to define which directory is searched. There could also be a: modprobe --find-uevent /sys/devices where the grep/sed is built-in as well. OTOH, perhaps these would be a nightmare code-wise. I'd sure like to know if this approach is significantly faster or not. BTW: I try to time most things in my initrd init script to keep an eye on what slows things down so we can stay as speedy as possible. 
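The gathering-and-dedup step James describes can be sketched in Python, with a temporary directory standing in for /sys/devices so the sketch is self-contained (a comment marks where the real code would invoke modprobe; all file names here are illustrative):

```python
# Sketch of the coldplug gathering step: collect every "modalias" file
# under a tree, deduplicate the values, and hand the unique set to
# modprobe. A temp dir stands in for /sys/devices.
import os
import tempfile

sysdir = tempfile.mkdtemp()
for sub, alias in [("a", "usb:v1D6Bp0002"), ("b", "pci:v8086d1234"),
                   ("c", "usb:v1D6Bp0002")]:        # duplicate on purpose
    os.makedirs(os.path.join(sysdir, sub))
    with open(os.path.join(sysdir, sub, "modalias"), "w") as f:
        f.write(alias + "\n")

aliases = set()
for root, _dirs, files in os.walk(sysdir):
    if "modalias" in files:
        with open(os.path.join(root, "modalias")) as f:
            aliases.add(f.read().strip())

# Real code would now run: modprobe -a -b -q <aliases>
print(sorted(aliases))
```

This is the same effect as `find /sys/devices -name modalias -exec cat '{}' + | sort -u`, and doing it in one process avoids the forking that the thread worries about.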
I use: cut -d ' ' -f22 /proc/self/stat to get the time in hundredths of a second since the kernel booted. But this itself and the associated arithmetic is not free time-wise, so I disable most of the timing when I'm not in debug mode. If there is a better/faster way to get the time to hundredths of a second or better, PLMK. A shell built-in would make me very happy. Perhaps this seems way OT, but ISTM that precise timing is essential for comparing how fast things are when the overall time is only a second or two. If there is a better way to get the time, PLMK; otherwise I suggest people use the cut command above for timing how long things take. Peace, James
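For reference, field 22 of /proc/self/stat is the process start time in clock ticks since boot (ticks are usually 1/100 s, hence "hundredths of a second"). A Python sketch of the same trick, Linux-only; note that field 2 (the comm name) may contain spaces, so splitting after the closing parenthesis is more robust than `cut -d ' ' -f22`:

```python
# Read the process start time (field 22 of /proc/self/stat) in clock
# ticks since boot, and convert to centiseconds. Linux-specific.
import os

with open("/proc/self/stat") as f:
    stat = f.read()

# Fields 1-2 are "pid (comm)"; comm may contain spaces, so split on the
# LAST ')' and count fields from there: field 3 (state) is index 0.
after_comm = stat.rsplit(")", 1)[1].split()
starttime_ticks = int(after_comm[19])   # overall field 22 = index 19 here

ticks_per_sec = os.sysconf("SC_CLK_TCK")  # usually 100
print("centiseconds since boot:", starttime_ticks * 100 // ticks_per_sec)
```

For a freshly spawned process this is effectively "time since boot", which is what makes it usable as a cheap boot-timing clock.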
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 16:53, Natanael Copa wrote: I have the feeling (without digging into the busybox code) that making mdev suitable for a long-lived daemon will not be that easy. I suspect there is lots of error handling that will just exit. A long-lived daemon would need to log and continue instead. The major mdev part will not be converted into a long-lived daemon. That is more or less code in a process doing a job like cat, but still you are right. It will take some time and has to be done carefully. No doubt. One of the reasons to have that fifo supervisor is the failure management. Even if the parser / handler process dies, this is caught and can fire up a failure script (something we do not have now). The kernel hotplug handler stays a normal process overall; it only needs to write the gathered info to the pipe instead of calling a function to handle the event operation. So it is mainly replacing the handler function with a write to the named pipe, and on the other side the hotplug gathering part is replaced by a read from stdin (pipe) with timeout. So the major work will go into restructuring the parser / device operation handler code, and I expect the need of a parser rewrite (which is straightforward for that simple syntax). Don't misunderstand, it will be an expensive piece of work, but compared to finding a specific bug in somebody else's program, the mdev code is simple ... and it's not my first busybox hacking; it is only the first time I do it as a public discussion in this channel / list. I started creating specialized BB versions around 1995, so I have some experience now. Did you note that pseudo code for the fifo supervisor? Maybe it hopped to the other thread. That is standard code, more or less comparable to e.g. tcpsvd
Re: RFD: Rework/extending functionality of mdev
On Friday, 13 March 2015, James Bowlin bit...@gmail.com wrote: On Fri, Mar 13, 2015 at 08:33 PM, Guillermo Rodriguez Garcia said: You are talking about a possible bug in the current implementation. In my opinion this is completely independent from whether a redesign/architecture change is required or wanted. ISTM you are assuming the redesign will not fix the bug. No, I am not assuming that. I am saying: if there is a bug, then let's fix it. The discussion on whether mdev needs a redesign is independent of that. In other words, I am saying that "there is a bug" is not in itself a reason to do a redesign. If the different versions are runtime options then it will be easy to see if the new versions fix the bug or not. Let's make it easy for me to see if your assumption is correct or not. There was no such assumption. [...] the only sane approach is to let the choice be at runtime so there is a fallback in case there is a bug that only shows up on specific hardware. This approach seems so obvious to me that I can't imagine it is controversial. No controversy on my side. In fact I am not advocating compile time options over runtime options, nor the opposite. I was just trying to understand why Michael's proposal was not good enough for Harald. Guillermo -- Guillermo Rodriguez Garcia guille.rodrig...@gmail.com
Re: RFD: Rework/extending functionality of mdev
Hi James, On Friday, 13 March 2015, James Bowlin bit...@gmail.com wrote: TL;DR: the current busybox (or my code) seems to be broken on one or two systems out of thousands; crucial modules don't get loaded using hotplug + coldplug. Please give me a runtime option so I can give my users a boot-time option to try the new approach. You are talking about a possible bug in the current implementation. In my opinion this is completely independent from whether a redesign/architecture change is required or wanted. [...] BTW: I try to time most things in my initrd init script to keep an eye on what slows things down so we can stay as speedy as possible. I use: cut -d ' ' -f22 /proc/self/stat to get the time in hundredths of a second since the kernel booted. But this itself and the associated arithmetic is not free time-wise. So I disable most of the timing when I'm not in debug mode. If there is a better/faster way to get the time to hundredths of a second or better, PLMK. On embedded targets I normally use grabserial for boot timing ( http://elinux.org/Grabserial) Just in case it's useful. Guillermo -- Guillermo Rodriguez Garcia guille.rodrig...@gmail.com
Re: RFD: Rework/extending functionality of mdev
On 13/03/2015 21:32, Michael Conrad wrote: I stand corrected. I thought there would be a partial write if the pipe was mostly full, but indeed, it blocks. Except you have to make the writes blocking, which severely limits what the writers can do - no asynchronous event loop. For a simple cat-equivalent between the netlink and a fifo, it's enough, but it's about all it's good for. And such a cat-equivalent is still useless. It's still a very bad idea to allow writes from different sources into a single fifo when there's only one authoritative source of data, in this case the netlink. If a process wants to read uevents from a pipe, it can simply read from an anonymous pipe, being spawned by the netlink listener. That's what my s6-uevent-listener and Natanael's nldev do, and I agree with you: there's simply no need to introduce fifos into the picture. -- Laurent ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
On Fri, Mar 13, 2015 at 08:33 PM, Guillermo Rodriguez Garcia said: You are talking about a possible bug in the current implementation. In my opinion this is completely independent from whether a redesign/architecture change is required or wanted. ISTM you are assuming the redesign will not fix the bug. If the different versions are runtime options then it will be easy to see if the new versions fix the bug or not. Let's make it easy for me to see if your assumption is correct or not. It is much easier on my end to tell a user to use a boot parameter than it is for me to build a new busybox for them and then have them install a new initrd. Even if the new designs do not fix the specific bug I mentioned, since the redesigns are dealing with hardware related issues (loading hardware specific modules) we really want to test the new designs on a large variety of hardware. ISTM the only sane approach is to let the choice be at runtime so there is a fallback in case there is a bug that only shows up on specific hardware. This approach seems so obvious to me that I can't imagine it is controversial. My access to thousands of machines has allowed me to catch and report a nasty bug in modprobe last year. I've also detected another bug related to loading hardware specific modules but this 2nd bug has been harder to nail down. My point is that if we make the new designs a runtime option then it makes it easier and safer for me to test them on thousands of different machines. There may be others who are in a similar situation who also have busybox code that runs on many different machines. Peace, James Science is the belief in the ignorance of experts. -- Richard Feynman ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
On 3/13/2015 11:21 AM, Harald Becker wrote: On 13.03.2015 12:41, Michael Conrad wrote: Stream-writes are not atomic, and your message can theoretically get cut in half and interleaved with another process writing the same fifo. (in practice, this is unlikely, but still an invalid design) ---snip--- O_NONBLOCK disabled, n <= PIPE_BUF: All n bytes are written atomically; write(2) may block if there is not room for n bytes to be written immediately ---snip--- I stand corrected. I thought there would be a partial write if the pipe was mostly full, but indeed, it blocks. If someone really wants a netlink solution they will not be happy with a fifo approximation of one. You missed the fact that my approach allows free selection of the mechanism. Choosing netlink means using netlink, as it should be. The event listener part is as small as possible and writes to the pipe, which fires up a parser / handler to consume the event messages. Are you suggesting even the netlink mode will have a process reading the netlink socket and writing the fifo, so another process can process the fifo? The netlink messages are already a simple protocol, just use it as-is. Pass the On 3/13/2015 12:14 PM, Harald Becker wrote: The new code would not be run like a hotplug helper, it would be run as a daemon, probably from a supervisor. But the old code is still there and can still be run as a hotplug helper. The new code behaves exactly like the old code. When used as a hotplug helper, it suffers from parsing the conf for each event. My approach is a splitting of the old mdev into two active threads, which avoids those problems even for those who like to stay with kernel hotplug. Then it sounds like indeed, you are introducing new configuration steps for the old-style hotplug helper? i.e. where does the fifo live? who owns it? what security implications does this have? Who starts the single back-end mdev processor? If started from the hotplug-helper, who ensures that only one gets started?
If people have existing systems using hotplug-helper mdev, you can't just change the implementation on them in a way that requires extra configuration. Everyone who has commented on this thread so far agrees with that. -Mike ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
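The PIPE_BUF guarantee quoted in the message above can be illustrated with a short sketch (Python used as a stand-in here; the event strings and the NUL framing are made-up examples, not busybox's actual format):

```python
import os
import select

# Writes of at most PIPE_BUF bytes to a pipe are atomic, so fixed-framing
# messages from several writers never interleave. This is the property the
# fifo design relies on.
PIPE_BUF = select.PIPE_BUF  # the OS's PIPE_BUF constant (4096 on Linux)

def write_event(fd, payload):
    # One event, one write(2) call, NUL-terminated framing (assumption).
    assert len(payload) <= PIPE_BUF, "event too large for an atomic write"
    os.write(fd, payload)

r, w = os.pipe()
write_event(w, b"add@/devices/pci0000:00/usb1\0")
write_event(w, b"remove@/devices/pci0000:00/usb1\0")
os.close(w)

data = os.read(r, 65536)
events = [e for e in data.split(b"\0") if e]
print(len(events))  # 2
```

Because each event fits in one atomic write, the reader can always split on the terminator without ever seeing half a message.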
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 23:33, Laurent Bercot wrote: On 11/03/2015 08:45, Natanael Copa wrote: With that in mind, wouldn't it be better to have the timer code in the handler/parser? When no new messages come from the pipe within a given time, the handler/parser just exits. I've thought about that a bit, to see if there really was value in making the handler exit after a timeout. And it's a lot more complex than it appears, because you then get respawner design issues, the same that appear when you write a supervisor. Which issues? What if the handler dies too fast and there are still events in the queue? Should you respawn the handler instantly? Spawning the handler is the job of the named pipe supervisor. At first it checks the exit code of the dying handler and spawns a failure script if it was not successful. Then it waits until data arrives in the pipe (or is still there = poll for reading), and finally spawns a new handler. The trick in this is to hold the pipe open for reading and writing in the supervisor. This way you avoid race conditions from recreating new pipes, and catch even the situation where an event arrives at the moment the handler has hit its timeout and is dying. Otherwise the supervisor does not touch the content transferred through the pipe. That's exactly the kind of load you're trying to avoid by having a (supposedly) long-lived handler. Should you wait for a bit before respawning the handler? How long are you willing to delay your events? A bit more checking is planned already. Currently I have a failure counter and detect when the parser successively dies unsuccessfully, but maybe we can add a respawn counter, which triggers a delay (maybe increasing) on too many respawns without processing all the pipe data; but when the handler exits and the pipe is empty (poll), the respawn counter is reset.
So you get two or three fast respawns after the handler dies (when there is a timeout on poll) and more data is in the pipe; then something seems to be wrong, so start adding increasing delays before respawning. The normal case is: when the handler exits due to a timeout, the pipe is empty, so we can reset the counter and have no need to delay the process respawn as soon as new data arrives in the pipe. And when the respawn counter goes above some limit or the handler dies unsuccessfully, a failure script is spawned first, with arguments: program name, exit code or signal, failure count. It is necessary to ask these questions, and have the mechanisms in place to handle that case - but the case should actually never happen: it is an admin error to have the event handler die too fast. Admins don't make errors! ;) So it's code that should be there but should never be used; and it's already there in the supervisor program that should monitor your netlink listener. Ok, you expect the netlink listener to be watched by a supervisor daemon? Fine, so the fifo supervisor should also be watched, as it got forked from the same process as the netlink reader ... that means when we detect handler failures, we can just die and let the outer supervisor do the job :) When that happens the system is usually on its way to hell ... and even if that happens, what does it mean to the system? ... hotplug events are no longer handled, we lose them and may have to re-trigger the plug events as soon as hotplug events are processed again (however this is achieved) ... and in the worst case you are back at semi-automatic device management, calling mdev -s to update the device file system. ... but consider the conf file got vandalized, or the device file system ... how do we recover from this? ... do you expect to handle those? ... wouldn't it be better to reboot, after counting the failure in some persistent storage? So my conclusion is that it's just not worth it to allow the event handler to die.
s6-uevent-listener considers that its child should be long-lived; That's the problem of spawning the handler in your netlink reader. The netlink reader has to open the pipe for writing in non-blocking mode, then write a complete message as a single chunk, and check the write for failure (you always need to check and handle that), done. If the open/write to the pipe is not possible, the device plug system has gone away and needs a restart, so let the netlink listener die (unusual condition). One critical condition should be watched and handled: when the pipe is full and the write (poll for write) hits its timeout, what then? ... but this is no different than in your solution. -- Harald
Re: RFD: Rework/extending functionality of mdev
On 14/03/2015 01:20, Harald Becker wrote: I'm using non-blocking I/O and expect handling of failure situations, e.g. writing into the pipe fails with EAGAIN: poll waits until writing is possible, with a timeout, then redo the write. Hm, after checking, you're right: the guarantee of atomicity applies with nonblocking IO too, i.e. there are no short writes. Which is a good thing, as long as you know that no message will exceed PIPE_BUF - and that is, for now, the case with uevents, but I still don't like to rely on it. Laurent, you're still stuck on netlink! Using a named pipe is a requirement for the hotplug helper stuff; how else should they get access to the pipe, if not via a named pipe? And what is the difference between fifos and pipes? By "pipe", I mean an anonymous pipe. And an anonymous pipe, created by the netlink listener when it forks the event handler, is clearly the right solution, because it is private to those two processes. With a fifo, aka named pipe, any process with the appropriate file system access may connect to the pipe, and that is a problem: - for writing: why would you need another process to write into the pipe? You have *one* authoritative source of information, which is the netlink listener; any other source is noise, or worse, malevolent. - for reading: having several readers on the same pipe is the land of undefined behaviour. You definitely don't want that. - generally speaking, fifos are tricky beasts, with weird and unclear semantics around what happens when the last writer closes, different behaviours wrt kernel and even *kernel version*, and more than their fair share of bugs. Programming a poll() loop around a fifo is a lot more complicated, counter-intuitive, and brittle than it should be (whereas anonymous pipes are easy as pie. Mmm... pie.) I'm talking about the netlink because it's a natural source of streamed uevents, which is exactly what you want to give to a long-lived uevent handler such as mdev -i.
I really have no idea why you're fixated on that fifo idea when all the pieces that you need already exist. A netlink listener forking a long-lived event handler and transmitting events to it via an anonymous pipe is the simplest design that will accomplish what you want. Pardon my bluntness, but I think you've been in write-only mode since the beginning of the discussion, and it is irritating. You are certainly very experienced, but you are not the only one who knows how Unix works; you are convinced we are not understanding your great plan, or that we want to prevent you from organizing your computers the way you want, or that you need to explain to us how fifos and super-servers work; none of that is true. This is the busybox mailing-list, we are not as clueless or malevolent as some people you may have had the displeasure of working with; and would you believe, some of us know how to design Unix software that does not need a power plant to run. Please take a look at my s6-uevent-listener/s6-uevent-spawner, or at Natanael's nldev/nldev-handler. The long-lived uevent handler idea is a *solved problem*. Take the code, busyboxify it if you want, make it into a single binary, whatever; all the tools you need are there. You have: - a short-lived uevent handler, registrable as a hotplug helper: mdev - a long-lived uevent handler: mdev -i (it's not there yet, but AIUI that's what you wanted to work on) - a netlink listener: s6-uevent-listener or nldev - a coldplug trigger: mdev -s - just in case someone would want that (and until mdev -i actually exists, we do want that), you even have long-lived programs suitable to be spawned by a netlink listener, that themselves spawn a helper such as mdev for every event: s6-uevent-spawner or nldev-handler. The advantage of this construct over simply registering mdev in /proc/sys/kernel/hotplug is that it serializes events. It does not get any simpler, or more modular, than this set of tools.
Your original plan was to: - write mdev -i: I think it's a good idea. - modify mdev -s to add more functionality than just triggering a coldplug: I don't think it's good design but I don't care as long as I can configure it out. Other people have also answered. The answers may not have been the ones you were looking for, but you wanted feedback, you got feedback. I'm not interested in providing 15 APIs to do the same thing. Users don't like the netlink ? Tough, if they want serialized uevents. What would you do if your kid wanted to drive a car but said he didn't like steering wheels, would you build him a car with a joystick ? I provided you with clear designs and working code. So did other people. (You said code is premature at this point. I'm sorry, but no, code is not premature when the problem is solved, and if you're not convinced, please simply study the code, which is extremely short.) Now it's all up to you. I would like to see a mdev -i, can you work on it ? If you prefer to keep beating around the bush and smoking crack
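The listener-spawns-handler pattern described above (anonymous pipe, fork, handler reads its stdin) can be sketched in a few lines. Python is used as a stand-in for the C the real tools are written in, and `wc -c` plays the role of the not-yet-written mdev -i:

```python
import os

def spawn_handler(argv):
    """Fork a child with the pipe's read end on its stdin.

    Returns (pid, write_fd); the parent (the netlink listener in the
    design above) streams events into write_fd. The pipe is anonymous,
    so it is private to these two processes.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                # child: the long-lived event handler
        os.close(w)
        os.dup2(r, 0)           # events arrive on stdin
        os.close(r)
        os.execvp(argv[0], argv)
    os.close(r)                 # parent keeps only the write end
    return pid, w

# Demo: `wc -c` stands in for the handler (mdev -i does not exist yet).
pid, w = spawn_handler(["wc", "-c"])
os.write(w, b"add@/devices/foo\0")
os.close(w)                     # EOF tells the handler to finish
_, status = os.waitpid(pid, 0)
```

Closing the write end is what lets the handler see EOF and exit cleanly; a supervisor above the listener handles the restart case.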
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 21:32, Michael Conrad wrote: Are you suggesting even the netlink mode will have a process reading the netlink socket and writing the fifo, so another process can then process the fifo? The netlink messages are already a simple protocol, just use it as-is. Pass the You got the function of the fifo manager (or supervisor) wrong. This little process never reads or touches the fifo data; its purpose is to fire up the parser (pipe consumer) when any gathering part has written something into the pipe and there is no running parser process. In addition this supervisor may spawn a failure_script, when the parser aborts unexpectedly. Have you ever used tcpsvd? This piece opens a network socket, then accepts incoming connections and passes the socket to a spawned service process. The fifo manager does the same for the named pipe. The data flow is: netlink daemon - pipe - parser, or hotplug helper - pipe - parser. The new code behaves exactly as the old code. When used as a hotplug helper, it suffers from parsing the conf file for each event. My approach is a splitting of the old mdev into two active threads, which avoids those problems even for those who like to stay with kernel hotplug. Then it sounds like indeed, you are introducing new configuration steps for the old-style hotplug helper? i.e. where does the fifo live? At a simple default: /dev/.xdev-pipe, because any reading of such parameters in the hotplug helper would slow down the operation. Remember, hotplug helpers are spawned in parallel. Who owns it? The only user who's allowed to do netlink operations, load modules, create any device nodes, etc. - root:root. What security implications does this have? Mode of the fifo will be 0600 = rw-------. Who starts the single back-end mdev processor? This is the job of the fifo manager (or named pipe supervisor). The processor, as you call it, is started on demand, when data is written to the fifo; the processor has to die when idle for some time.
If started from the hotplug-helper, who ensures that only one gets started? Started from the hotplug helper? The helper won't ever start anything, just:

hotplug helper:
  gather event information
  sanitize message
  open named pipe for writing
    (ok, if this open fails seriously, we are in big trouble)
    (true for many other such operations)
    (to be discussed what's the best failure handling for this)
  if pipe is open, (safe) write the event message
    (safe means: in a loop, checking for success)
  exit 0

netlink reader:
  open named pipe
    (for failures here I have already added an option)
    (will spawn a given script with the failure reason)
    (or otherwise retries some times, then dies)
  open netlink socket
  in an endless loop:
    wait for messages arriving
    sanitize message
    (safe) write the event message into the pipe

fifosvd:
  create named pipe (fifo)
  open fifo for reading and writing in non-blocking mode
  in an endless loop:
    wait for data arriving in pipe (poll)
    spawn the parser process, redirecting stdin from the fifo
    wait for exit of the spawned process
    if it did not exit successfully:
      spawn the given failure script with arguments
      if the failure script exits unsuccessfully, then die

parser:
  read conf file into memory table
  while read next message from stdin with timeout:
    sanity checks of message (paranoia)
    look up device entry in memory table
    do the required operation for the message

Is this better for you? I really hate code hacking before I'm able to finish planning. If people have existing systems using hotplug-helper mdev, you can't just change the implementation on them in a way that requires extra configuration. Which extra configuration? Everyone who has commented on this thread so far agrees with that. You definitely misunderstand my approach and how it works! -- Harald
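The heart of the fifosvd flow described in this message — hold the fifo open O_RDWR so it survives writers and parsers coming and going, poll for data, then spawn a parser with stdin redirected from the fifo — can be sketched as one loop iteration. Python is a stand-in; the fifo path and the `head -c` parser are placeholders for demonstration:

```python
import os
import select
import subprocess
import tempfile

# One iteration of the fifosvd loop. Opening the fifo O_RDWR means the open
# never blocks and the fifo never loses its last reader/writer, which is the
# race-avoidance trick described above.
fifo = os.path.join(tempfile.mkdtemp(), "xdev-pipe")
os.mkfifo(fifo, 0o600)
fd = os.open(fifo, os.O_RDWR | os.O_NONBLOCK)

os.write(fd, b"add@/devices/foo\0")      # a gathering part delivers one event

p = select.poll()
p.register(fd, select.POLLIN)
out = b""
if p.poll(1000):                         # data arrived: fire up the parser,
    # `head -c 17` stands in for `xdev parser` reading from the fifo
    out = subprocess.run(["head", "-c", "17"], stdin=fd,
                         capture_output=True).stdout
```

In the real design the supervisor would loop back to poll() after the parser exits, and spawn a failure script on a bad exit code; none of that bookkeeping is shown here.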
Re: RFD: Rework/extending functionality of mdev
Hi, my original intention was to replace mdev with an updated version, but as there were several complaints from people who would like to continue using the old mdev, I'm thinking about an alternative. ... but the essential argument came from James: fail-safe operation. In my last summary I used the name xdev, just as a placeholder to distinguish it from the current mdev. I don't think the name xdev would be a good choice to stay with, neither is nldev, as my approach allows including / using different mechanisms, ... ... so what would be the best choice for a name for this updated implementation of the device system management command? With a different name, we can just leave the mdev code as is, and put the new code in a new applet (no code sharing except the usual libbb). Then you can opt in whichever version you like, or even both, and choose at runtime which one to use. ... but if someone now complains about the big code size overhead when two device managers get included without code sharing, I send him a: kill -s 9 -1 ... and I won't later change the name of the new xdev implementation to mdev, because someone complains and wants to use the newer version but stay with the name mdev.
So call the command xdev for now, to distinguish it from current mdev operation:

xdev -i - do the configured initial setup stuff for the device file system (this is optional, but I like the one-shot startup idea)
xdev -f - start up the fifo manager, if none is running (manual use is special purpose only)
xdev -k - disable the hotplug handler, kill a possibly running netlink daemon (for internal and special purpose usage; kill is not perfect yet, there is a race condition when switching mechanisms - needs more thinking)
xdev -p - (changed the parameter due to criticism) select the kernel hotplug handler mechanism (auto-includes -f and -k)
xdev -n - select the netlink mechanism and start the daemon (auto-includes -f and -k)
xdev -c - do the cold plug initiation (triggering uevents) (also auto-includes -f)
xdev -s - do the cold plug as mdev -s does (also auto-includes -f)
xdev (no parameter) - can be used as the kernel hotplug helper
xdev netlink - the netlink reader daemon (this is for internal use)
xdev parser - the mdev.conf parser / device operation process (this is for internal use)

Command parsing will be stupid: if it is not an option, only the first character is checked, so xdev pumuckl is the same as xdev parser. Each of the mentioned parts except the fifo startup can easily be opted out of in the config, but otherwise wastes only some bytes in the binary. The fifo manager (named pipe supervisor) daemon itself is not included in this list, as a general fifosvd as a separate applet seems to be the better place (it is just used internally by -f). The current mdev -s may become either (other combinations possible):

xdev -s = do sysfs file scanning as mdev -s does, but use the new back end
xdev -pc = kernel hotplug mechanism, trigger the cold plug (uses xdev as hotplug handler)
xdev -nc = netlink mechanism, trigger the cold plug (starts the xdev netlink reader daemon)

The only other change in the init scripts shall be to remove the old setting of mdev as hotplug helper in the kernel completely (done implicitly by -p).
All those may be combined with -i; then at first the configured setup operations are performed, and thereafter the other requested actions. That does *not* mean xdev -i does any binary-encoded setup stuff. It shall read the config file (suggestion: first try /etc/xdev-init.conf, then fall back to /etc/xdev.conf when the former does not exist) and invoke the required operations for the configured setup lines. The setup lines are only used by xdev -i and otherwise ignored by xdev parser (like comments). ... more brainstorming: Just for those who may need such a feature: If you start the fifo supervisor manually, you can arrange to start up a different back end parser. This may be used to send all device event messages to a file:

#!/bin/busybox sh
tee -a /dev/.xdev-log | exec xdev parser

When this wrapper is used as the back end, it will catch and append a copy of each event message to /dev/.xdev-log, which could itself be a named pipe, to put messages in a file and watch the file size to rotate files when required. ... but I'm thinking of adding an xdev -l[LOG_FILE], which overrides -f and sets up the fifo supervisor to do the logic of the above wrapper, but without invoking an extra shell. Some neat trick:

xdev -l/dev/tty9 -pc
  start fifo supervisor
  set xdev as hotplug helper
  trigger the cold plug events

Beside normal parsing, a copy of all event messages is written to the tty. And again: This is not for normal usage, only for debugging purposes and those interested in lurking at their device messages. ... as a maybe: xdev -e[FAILURE_SCRIPT] - spawn the failure script when an operation has serious problems
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 21:43, Laurent Bercot wrote: Except you have to make the writes blocking, which severely limits what the writers can do - no asynchronous event loop. For a simple cat-equivalent between the netlink and a fifo, it's enough, but it's about all it's good for. And such a cat-equivalent is still useless. I'm using non-blocking I/O and expect handling of failure situations, e.g. writing into the pipe fails with EAGAIN: poll waits until writing is possible, with a timeout, then redo the write. It's still a very bad idea to allow writes from different sources into a single fifo when there's only one authoritative source of data, in this case the netlink. Ohps!? one netlink daemon - pipe - parser, or many hotplug helpers - pipe - parser. If a process wants to read uevents from a pipe, it can simply read from an anonymous pipe, being spawned by the netlink listener. That's what my s6-uevent-listener and Natanael's nldev do, and I agree with you: there's simply no need to introduce fifos into the picture. Laurent, you're still stuck on netlink! Using a named pipe is a requirement for the hotplug helper stuff; how else should they get access to the pipe, if not via a named pipe? And what is the difference between fifos and pipes? From man 7 pipe: ---snip--- Pipes and FIFOs (also known as named pipes) provide a unidirectional interprocess communication channel. A pipe has a read end and a write end. Data written to the write end of a pipe can be read from the read end of the pipe. ---snip--- You see? So why is a pipe good (your choice) and a fifo bad (mine)? They differ only in how you get access to the descriptors. -- Harald
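The write-side failure handling Harald describes (non-blocking write; on EAGAIN, poll until writable with a timeout, then redo the write) can be sketched as follows. Python stands in for the C implementation, and the timeout value is an arbitrary choice:

```python
import os
import select

def safe_fifo_write(fd, msg, timeout_ms=5000):
    """Non-blocking single-chunk write with poll-and-retry on EAGAIN.

    Returns False if the pipe stays full past the timeout - the critical
    "pipe full, nobody reading" condition mentioned in the thread.
    """
    while True:
        try:
            os.write(fd, msg)
            return True
        except BlockingIOError:            # EAGAIN: pipe currently full
            p = select.poll()
            p.register(fd, select.POLLOUT)
            if not p.poll(timeout_ms):     # never became writable: give up
                return False

# Demo: fill a pipe so the timeout path triggers.
r, w = os.pipe()
os.set_blocking(w, False)
try:
    while True:
        os.write(w, b"x" * 4096)           # fill the pipe buffer
except BlockingIOError:
    pass
full = safe_fifo_write(w, b"add@/devices/foo\0", timeout_ms=50)
```

With no reader draining the pipe, `full` comes back False after the 50 ms timeout; once a reader drains it, the same call would succeed.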
Re: RFD: Rework/extending functionality of mdev
On 11/03/2015 08:45, Natanael Copa wrote: The netlink listener daemon will need to deal with the event handler (or parser as you call it) dying. I mean, the handler (the parser) could get some error, out of memory, someone broke mdev.conf or anything that causes mdev to exit. If the child (the handler/parser) dies, a new pipe needs to be created and the handler/parser needs to be re-forked. With that in mind, wouldn't it be better to have the timer code in the handler/parser? When no new messages come from the pipe within a given time, the handler/parser just exits. I've thought about that a bit, to see if there really was value in making the handler exit after a timeout. And it's a lot more complex than it appears, because you then get respawner design issues, the same that appear when you write a supervisor. What if the handler dies too fast and there are still events in the queue? Should you respawn the handler instantly? That's exactly the kind of load you're trying to avoid by having a (supposedly) long-lived handler. Should you wait for a bit before respawning the handler? How long are you willing to delay your events? It is necessary to ask these questions, and have the mechanisms in place to handle that case - but the case should actually never happen: it is an admin error to have the event handler die too fast. So it's code that should be there but should never be used; and it's already there in the supervisor program that should monitor your netlink listener. So my conclusion is that it's just not worth it to allow the event handler to die. s6-uevent-listener considers that its child should be long-lived; if it dies, s6-uevent-listener dies too with an error message. It will be picked up by its supervisor (and the new instance will also respawn the handler). I'm happy to trade a bit of swap space for a significant decrease in the amount of required code.
-- Laurent
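The "handler exits when idle" behaviour debated above amounts to a poll loop on stdin with a timeout. A minimal sketch (Python stand-in; the idle timeout and the NUL event framing are assumptions, not busybox behaviour):

```python
import os
import select

def run_handler(fd=0, idle_ms=2000):
    """Consume NUL-terminated events from fd; exit once idle past the timeout.

    Returns the number of events handled. In the fifo design, the supervisor
    would respawn this process as soon as new data arrives in the pipe.
    """
    p = select.poll()
    p.register(fd, select.POLLIN)
    handled = 0
    while True:
        if not p.poll(idle_ms):
            return handled          # idle timeout: exit voluntarily
        data = os.read(fd, 65536)
        if not data:
            return handled          # writer side closed: also exit
        handled += data.count(b"\0")

# Demo on an ordinary pipe standing in for the fifo.
r, w = os.pipe()
os.write(w, b"add@/devices/foo\0remove@/devices/foo\0")
os.close(w)
count = run_handler(r, idle_ms=100)
```

This is exactly the respawn-on-demand loop whose corner cases (fast death with events still queued, respawn delays) Laurent argues are not worth the complexity.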
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 00:05, Michael Conrad wrote: On 03/12/2015 04:32 PM, Harald Becker wrote: On 12.03.2015 19:38, Michael Conrad wrote: On 3/12/2015 12:04 PM, Harald Becker wrote: but that one will only work when you either use the kernel hotplug helper mechanism, or the netlink approach. You drop out those who can't / don't want to use either. ...which I really do think could be answered in one paragraph :-) If the netlink socket is the right way to solve the forkbomb problem that happens with hotplug helpers, then why would anyone want to solve it the wrong way? I don't understand the need. To clarify (adding in here #0, to not forget cold plug and semi-automatic handling): 1 - kernel-spawned hotplug helpers is the traditional way, 2 - a netlink socket daemon is the right way to solve the forkbomb problem ACK, but #2 blocks usage for those who like / need to stay at #1 / #0 3 - kernel-spawned fifo-writer, with the fifo read by a hotplug daemon, is solving it the wrong way. NO! This is splitting the operation of a big process into different threads, using an interprocess communication method. Using a named pipe (fifo) is the proven Unix way for this ... and it allows #2 without blocking #1 or #0. ohh, good question! ... ask Isaac! (answer in one paragraph?) What I hear Isaac say is: leave #1 (the traditional way) alone, I want to keep using it. I agree with him that it should stay. But I would choose to use #2 if it were available. I am asking the purpose of #3. The purpose of #3 is splitting the hotplug handler needed for the traditional #1 from the suffering part, putting the latter, with some rearranging, into a separate process, using a proven interprocess communication method (IPC). Now you may use whichever mechanism you like, and whichever mechanism you choose, it will benefit from the overall speed improvement (as events tend to arrive in bursts, and there is no more extra parsing for the 2nd and following events). That's it.
I try to provide the work to step to #3, allowing everybody to use the mechanism he likes; Isaac on the other hand blocks any innovation, forcing me and others to either stay at #1 too, or choose a different / external program (with the consequence of code duplication or complex code sharing, the opposite of clarity). So I think your answer to my original question is: the fifo design is a way to have #1 and #2 without duplicating code. The fifo design is a proven method to split the operation of complex processes into smaller threads, using an interprocess communication method. In that case, I would offer this idea: All you do is throw in complex code sharing and the need to choose a mechanism ahead at build time, to allow for switching to some newer stuff ... but what about pre-generated binary versions: which mechanism shall be used in the default options, which mechanism shall be offered? With netlink active (surely the proven and better way for the job), you hit those like Isaac. With netlink disabled, spreading newer technology widely is usually blocked (not talking about some experts who know how to build their own BB version). So why not allow some innovation, and let the user choose which mechanism to use? What is wrong with this intention? I neither want to reinvent the wheel, nor go the udev way and create a big monolithic block, but I would like the ability to set up the system the way I like, without blocking others from using the plug mechanism they like. -- Harald
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 10:30, Guillermo Rodriguez Garcia wrote: There are many configuration options in BB that must be defined at build time. I don't see why this one would be different. You can activate both as the default (at the cost of some bytes of code size overhead), and let the user of the binary decide which mechanism he prefers, or even flip temporarily (without system interruption). Users that want a functional solution will probably not care much about the underlying implementation. Exactly; that means using only one mechanism is forcing those users to do it in a specific way, with all sorts of consequences. Those who want to tailor BB to fit their preferences most likely don't have a problem with building their own BB. Ok, and what's then wrong with my intended approach? You will be able to opt out of most of the parts, if you like, even the parser / handler (think of wanting to handle device management in a script without reinventing the event plug mechanism). It is a modular system; just tie together those functions you like. A device handler could be:

#!/bin/sh
while read -t TIMEOUT message
do
    # now split the received message ...
    # and set up your device node entries ...
done
exit 0

... or think vice versa: Let someone find a new, ultra-good mechanism, but still want to use the conf parser / device handler back end. So opt out of the plug mechanisms and leave in the parser, then use a small external program with the ultra-new mechanism (until it may be added as another optional mechanism in Busybox, like netlink). A modular system: put together the required parts. The only caveat is that you need to fire up some kind of service daemon for the device management system ... else the start of this service daemon (fifo manager is just another name for it) needs to be coupled with some other part ... which runs against Laurent's wishes, who keeps poking me about clarity and functional separation.
Re: RFD: Rework/extending functionality of mdev
2015-03-13 11:19 GMT+01:00 Harald Becker ra...@gmx.de: On 13.03.2015 10:30, Guillermo Rodriguez Garcia wrote: There are many configuration options in BB that must be defined at build time. I don't see why this one would be different. You can activate both as the default (at the cost of some bytes of code size overhead), and let the user of the binary decide which mechanism he prefers, or even flip temporarily (without system interruption). Sure. But this same argument could also be applied to many other options in BB which currently are defined at build time. I am just saying that in most other areas, Busybox does not work like this. Users that want a functional solution will probably not care much about the underlying implementation. Exactly; that means using only one mechanism is forcing those users to do it in a specific way, with all sorts of consequences. I understand your argument. You are saying that users should be able to choose at runtime. What I say is that my impression is that most users belong to one of the following two groups: those who don't really care, and those who are happy making this choice at build time. Guillermo Rodriguez Garcia guille.rodrig...@gmail.com
Re: RFD: Rework/extending functionality of mdev
On Thu, 12 Mar 2015 17:26:38 + Isaac Dunham ibid...@gmail.com wrote: On Thu, Mar 12, 2015 at 04:04:41PM +0100, Harald Becker wrote: No, you misunderstand. Read my proposal below and tell me why this won't do what you're after, OTHER than the way mdev works now is broken/wrong, since that *isn't* universally accepted. As you stipulated about your design, applet and option names can be changed easily; but when I say new applet, I mean to indicate that this should be separate from mdev. * mdev (no options) ~ as it works now (i.e., a hotplugger that parses mdev.conf itself) * mdev -s (scan sysfs) ~ as it works now, or could feed the mdev parser Yes. * mdev -i/-p (read events from stdin) The mdev parser, accepting a stream roughly equivalent to a series of uevent files with a separator following each event[1]. To make it read from a named pipe/fifo, the tool that starts it could use dup2(). Yes, this is an option. I am not convinced it is the best solution though. I think that the changes in mdev might be more intrusive than the mdev maintainer feels comfortable with. Still, I don't have any better ideas. While solving this, I would also like to find a way to load the MODALIAS without forking. One option is to add a modprobe -i which reads modaliases from stdin with a timeout. Maybe that is lower-hanging fruit than mdev -i? * new applet: nld Netlink daemon that feeds the mdev parser. I have implemented a proof-of-concept for this in case someone wants to experiment with mdev -i: http://git.alpinelinux.org/cgit/ncopa/nldev/tree/nldev.c I am not convinced that this should be implemented as a busybox applet. It would be nice if it was, though. * new applet: fifohp Your hotplug helper, fifo watch daemon that spawns a parser, and hotplug setup tool. I had actually thought that it might work at least as well if, rather than starting a daemon at init, the fifo hotplugger checks if there's a fifo and *becomes* the fifo watch daemon if needed.
Also, I was thinking in terms of writing to a pipe because that lets us make sure events get delivered in full (i.e., what happens if mdev dies halfway through reading an event?) Yes. Partially written/read events need to be handled properly, as Laurent pointed out too. This way, + mdev is the only applet parsing mdev.conf; + all approaches to running mdev are possible; + it's easy to switch from mdev to the new hotplugger, while still having mdev available if the new hotplugger breaks; + mdev is only responsible for tasks that involve parsing mdev.conf. And people who want the change don't have to do more than your proposal would require. This is the direction I want to go, yes. [1] The format proposed by Laurent uses \0 as a line terminator; I think it might be better to use something that's more readily generated by standard scripting tools from uevent files, which would make it possible to use cat or env to feed the mdev parser. I liked the \0 as event terminator. It's simple. -nc
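A parser for the \0-terminated stream discussed in this message is only a few lines. This sketch (Python stand-in) assumes each event is a block of KEY=VALUE lines, roughly the content of a uevent file, followed by a NUL; the exact framing is one of the open questions above, and the sample keys are the usual kernel uevent ones:

```python
def parse_events(stream):
    """Split a byte stream of NUL-terminated uevent blocks into dicts."""
    events = []
    for chunk in stream.split(b"\0"):
        if not chunk:
            continue                      # skip the empty tail after the last NUL
        fields = {}
        for line in chunk.decode().splitlines():
            if "=" in line:
                key, value = line.split("=", 1)
                fields[key] = value
        events.append(fields)
    return events

raw = (b"ACTION=add\nDEVPATH=/devices/foo\nMODALIAS=usb:v1D6Bp0002\0"
       b"ACTION=remove\nDEVPATH=/devices/foo\0")
events = parse_events(raw)
```

Because the terminator sits outside the key/value lines, the same stream could indeed be produced by cat-ing uevent files with a printf '\0' between them, which is Isaac's point about scriptability.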
Re: RFD: Rework/extending functionality of mdev
Hi Harald, 2015-03-13 8:25 GMT+01:00 Harald Becker ra...@gmx.de: On 13.03.2015 00:05, Michael Conrad wrote: In that case, I would offer this idea: All you do is throw in complex code sharing and the need to choose a mechanism ahead at build time, to allow for switching to some newer stuff ... but what about pre-generated binary versions: which mechanism shall be used in the default options, which mechanism shall be offered? With netlink active (surely the proven and better way for the job), you hit those like Isaac. With netlink disabled, spreading newer technology widely is usually blocked (not talking about the few experts who know how to build their own BB version). So why not allow some innovation, and let the user choose which mechanism to use? What is wrong with this intention? I neither want to reinvent the wheel, nor go the udev way and create a big monolithic block, but I would like the ability to set up the system the way I like, without blocking others from using the plug mechanism they like. Michael's proposal would allow you to do what you want to do, since you are one of those experts who know how to build their own BB version. So what's wrong with his proposal? Guillermo
Re: RFD: Rework/extending functionality of mdev
On Thu, 12 Mar 2015 17:31:36 +0100 Harald Becker ra...@gmx.de wrote: Every gathering part grabs the required information, sanitizes, serializes and then writes some kind of command to the fifo. The fifo management (a minimalist daemon) starts a new parser process when there is none running and watches its operation (failure handling). If you are talking about named pipes (created by mkfifo) then your fifo approach will break here. Once the writing part of the fifo (eg a gathering part) is done writing and closes the writing end, the fifo is consumed and needs to be re-created. No other gathering part will be able to write anything to the fifo. ??? You don't understand the operation of named pipes! Any program (with appropriate rights) may open the named pipe (fifo) for writing. As long as the data for one event is written in one big chunk, it won't interfere with possible parallel writers. If the fifo device is closed by a writer, this does not mean it vanishes; just reopen it for writing and write the next event chunk. What I meant was that the reader needs to reopen it too. Basically what you describe as a fifo manager sounds more like a bus, like dbus. It is the old and proven inter-process communication mechanism of Unix, nothing new. And "fifo manager" sounds really big; it is a really minimalistic daemon, with primitive operation (it never touches the data in the fifo, nor even reads that data). Its main purpose, besides creating the fifo, is to fire up a conf parser / device operation process when required, and to react to their failure (spawn a script with args). And connect the parser to the fifo? My point is that you need a minimalist daemon that is always there. Why not let that daemon listen for netlink events instead? Since it is as simple as you say, why not write a short demo? It may help me understand what you really mean. -nc
Re: RFD: Rework/extending functionality of mdev
2015-03-13 11:54 GMT+01:00 Harald Becker ra...@gmx.de: On 13.03.2015 11:25, Guillermo Rodriguez Garcia wrote: I understand your argument. You are saying that users should be able to choose at runtime. What I say is that my impression is that most users belong to one of the following two groups: Those who don't really care, and those who are happy making this choice at build time. Putting people into either-or categories is not very handy; humans are too different, and you won't predict the exact needs of the next person coming around. So is either blocking any innovation, or forcing one half of your either-or group to do things in a specific way (whether they doubt it or not), your way? Not mine! Perhaps my message is not getting across. What I am saying is that I am not sure your suggestion is actually something others would find useful. Other than the fact that you like it yourself, I don't see a lot of enthusiasm about it. Guillermo
Re: RFD: Rework/extending functionality of mdev
On 3/13/2015 9:48 AM, Harald Becker wrote: On 13.03.2015 12:29, Michael Conrad wrote: On 3/13/2015 3:25 AM, Harald Becker wrote: 1 - kernel-spawned hotplug helpers is the traditional way, 2 - netlink socket daemon is the right way to solve the forkbomb problem ACK, but #2 blocks usage for those who like / need to stay at #1 / #0 In that case, I would offer this idea: All you do is throw in complex code sharing and the need to choose a mechanism ahead at build time, to allow for switching to some newer stuff ... but what about pre-generated binary versions: which mechanism shall be used in the default options, which mechanism shall be offered? Please review it again. My solution solves both #1 and #2 in the same binary, with no code duplication. First complex code reuse, and then: How will you do it without suffering from the hotplug handler problem as current mdev does? I don't see that you try to handle this problem. My solution is to let the kernel hotplug handler mechanism also benefit, and avoid that parallel parsing for each event. ... besides that, this close / open_netlink looks suspicious; looks like a possible race condition. I thought pseudocode would be clearer than English text, but I suppose my pseudocode is still really just English... Maybe some comments will help. The new code would not be run like a hotplug helper; it would be run as a daemon, probably from a supervisor. But the old code is still there and can still be run as a hotplug helper.

mdev_main() {
    read_options();
    load_config();
    // if user requests --netlink mode, we act like a daemon
    if (option_netlink) {
        // If --netlink-on-stdin then netlink is open for us already;
        // if not, then we need to create our netlink socket
        if (!option_netlink_on_stdin) {
            close(0);
            // the new socket will now be file descriptor 0
            open_netlink_socket();
        }
        // use 'select()' to see if a new netlink message is ready.
        // if the user gave us a --timeout then we exit if no new
        // netlink message arrives in a certain amount of time
        while (select([0], timeout)) {
            if (recv(0, message)) {
                // A netlink message is a list of variables. We call 'setenv' for each.
                apply_env_from_message(message);
                // Now we have all the hotplug variables, so call the old code.
                process_request();
            }
            // keep running in a loop until timeout (or forever if no timeout)
        }
    } else
#endif
        process_request();
}
Re: RFD: Rework/extending functionality of mdev
On 3/13/2015 3:25 AM, Harald Becker wrote: 1 - kernel-spawned hotplug helpers is the traditional way, 2 - netlink socket daemon is the right way to solve the forkbomb problem ACK, but #2 blocks usage for those who like / need to stay at #1 / #0 In that case, I would offer this idea: All you do is throw in complex code sharing and the need to choose a mechanism ahead at build time, to allow for switching to some newer stuff ... but what about pre-generated binary versions: which mechanism shall be used in the default options, which mechanism shall be offered? Please review it again. My solution solves both #1 and #2 in the same binary, with no code duplication. I suggested wrapping #2 in an ifdef for the people who don't have netlink at all, such as on BSD, and also anyone who doesn't want the extra bytes. -Mike
Re: RFD: Rework/extending functionality of mdev
On 3/13/2015 3:25 AM, Harald Becker wrote: This is splitting operation of a big process in different threads, using an interprocess communication method. Using a named pipe (fifo) is the proven Unix way for this ... and it allows #2 without blocking #1 or #0. Multiple processes writing into the same fifo is not a valid design. Stream-writes are not atomic, and your message can theoretically get cut in half and interleaved with another process writing the same fifo. (in practice, this is unlikely, but still an invalid design) If you want to do this you need a unix datagram socket, like they use for syslog. It is also a broken approximation of netlink because you don't preserve the ordering that netlink would give you, which according to the kernel documentation was one of the driving factors to invent it. If someone really wants a netlink solution they will not be happy with a fifo approximation of one.
Re: RFD: Rework/extending functionality of mdev
On Thu, 12 Mar 2015 19:05:01 -0400 Michael Conrad mcon...@intellitree.com wrote: In that case, I would offer this idea: Thanks. Your pseudo code makes perfect sense to me. ... 3. Then to support the ability to launch mdev connected to a netlink socket that already exists, and time out when not used,

mdev_main() {
    read_options();
    load_config();
#ifdef FEATURE_MDEV_NETLINK
    if (option_netlink) {
        if (!option_netlink_on_stdin) {
            close(0);
            open_netlink_socket();
        }
        while (select([0], timeout)) {
            if (recv(0, message)) {
                apply_env_from_message(message);
                process_request();
            }
        }
    } else
#endif
        process_request();
}

I have the feeling (without digging into the busybox code) that making mdev suitable as a long-lived daemon will not be that easy. I suspect there is a lot of error handling that will just exit. A long-lived daemon would need to log and continue instead. If I am right about that, then those who want the option_netlink case will probably need to run the netlink listener as a separate process anyway, and instead set the timeout to never. Then we end up with:

mdev_main() {
    read_options();
    load_config();
#ifdef FEATURE_MDEV_STDIN
    while (select([0], timeout)) {
        if (recv(0, message)) {
            apply_env_from_message(message);
            process_request();
        }
    }
#else
    process_request();
#endif
}

This would be enough for me. I think this will be even smaller than what you propose with the fifo. It will do netlink, it will do the traditional hotplug helper, and even allow the trick where a tiny daemon monitors netlink and can start mdev in daemon mode on demand. -Mike
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 12:41, Michael Conrad wrote: On 3/13/2015 3:25 AM, Harald Becker wrote: This is splitting the operation of a big process into different threads, using an inter-process communication method. Using a named pipe (fifo) is the proven Unix way for this ... and it allows #2 without blocking #1 or #0. Multiple processes writing into the same fifo is not a valid design. Who told you that? It is *the* proven N-to-1 IPC in Unix. Stream-writes are not atomic, and your message can theoretically get cut in half and interleaved with another process writing the same fifo. (in practice, this is unlikely, but still an invalid design) This is not completely correct; picked out of the Linux pipe manual page (man 7 pipe): ---snip--- O_NONBLOCK disabled, n <= PIPE_BUF: All n bytes are written atomically; write(2) may block if there is not room for n bytes to be written immediately ---snip--- As long as the message written to the pipe/fifo is no larger than PIPE_BUF, the kernel guarantees atomicity of the write; message mixing only happens when you write single messages larger than PIPE_BUF, or use split writing (e.g. fprintf without setting line-buffered mode). PIPE_BUF is at least 512 (the POSIX minimum), and 4k on Linux. If you want to do this you need a unix datagram socket, like they use for syslog. Socket overhead is higher than writing to a pipe: not only in code size, but more so in the CPU cost of passing the messages. It is also a broken approximation of netlink because you don't preserve the ordering that netlink would give you, which according to the kernel documentation was one of the driving factors to invent it. Sure. You say netlink is the better solution; I say netlink is, too, but next door you may find someone who dislikes netlink usage. We are not living in a perfect world. Ordering is handled differently in mdev; that shall stay as is.
My approach can't solve every single problem of this method, but that is up to those who like to stay with it; still, they should gain from the speed improvement, and have fewer problems from race conditions (each device operation is done without mixing with other device operations, as under pure parallelism). Additionally the hotplug helper speed is increased and it does a really early exit compared to current mdev (or your approach). This should reduce system pressure and event reordering, but will indeed not avoid it (needs to be synchronized) ... but I got a different idea: I have heard the kernel provides a sequence number, which is used in mdev to do synchronization. Maybe we should just send the messages to the pipe as fast as possible, but prefix them with the event sequence number. The parser reads a message and checks the sequence number, pushing reordered messages onto a back list until the right message arrives (or some timeout - as done in mdev, but without needing to read / write a file). Oh, I think the sequence number info is in the docs/mdev.txt description, including how this is done in mdev. If someone really wants a netlink solution they will not be happy with a fifo approximation of one. You missed the fact that my approach allows free selection of the mechanism. Choosing netlink means using netlink, as it should be. The event listener part is as small as possible and writes to the pipe, which fires up a parser / handler to consume the event messages. Where is there an approximation? The kernel hotplug helper mechanism is a different method, but also available for those who like to use it. Either one will leave only some unused code (if not opted out in the config). The difference is that the default config can include both mechanisms in pre-built binaries. The user can choose and test the mechanism he wants, and then possibly build a specific version and opt out the unwanted stuff. -- Harald
Re: RFD: Rework/extending functionality of mdev
There are interesting technical points in this discussion, but it turns out to be mostly about philosophy and frustration. Harald, there are two points in your arguments which make no sense to me: On 12/03/2015 17:31, Harald Becker wrote: ... because there are people who dislike using netlink and want to use the kernel hotplug helper mechanism. That's it. People's preferences differ. Opt out the functions you dislike in the BB config. Hotplug is KISS; it is stupid, maybe, but it is so simple that you can probably do the job with a script. The same serialization you propose to implement in user space by means of several processes, a named pipe and still the fork bomb, has been implemented in the kernel without the fork bomb: it is called netlink. These people you are talking of, who would like to see hotplug serialized but do not want netlink, do they really exist? This set of people is most likely the empty set. In case they really exist, then they must be idiots, and then, well, should Busybox support idiocy? On 11/03/2015 19:02, Harald Becker wrote: It is neither a knowledge nor any technical problem, it is preference: I want to have *one* static binary in the minimal system, and to be able to run a full system setup with this (system) binary (or call it a tool set). I agree it's fun to have all tools in one static binary. But I don't see any serious reason to make it an absolute condition. You speak of *preference*, but this very one looks pretty futile. I don't see the problem with having even a dozen applications, all static; why not, I'm also a fan of static linking. Best regards. Didier
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 12:29, Michael Conrad wrote: On 3/13/2015 3:25 AM, Harald Becker wrote: 1 - kernel-spawned hotplug helpers is the traditional way, 2 - netlink socket daemon is the right way to solve the forkbomb problem ACK, but #2 blocks usage for those who like / need to stay at #1 / #0 In that case, I would offer this idea: All you do is throw in complex code sharing and the need to choose a mechanism ahead at build time, to allow for switching to some newer stuff ... but what about pre-generated binary versions: which mechanism shall be used in the default options, which mechanism shall be offered? Please review it again. My solution solves both #1 and #2 in the same binary, with no code duplication. First complex code reuse, and then: How will you do it without suffering from the hotplug handler problem as current mdev does? I don't see that you try to handle this problem. My solution is to let the kernel hotplug handler mechanism also benefit, and avoid that parallel parsing for each event. ... besides that, this close / open_netlink looks suspicious; looks like a possible race condition. What is wrong with splitting a complex job into different threads? The splitting alone, with insertion of the named pipe (a long-proven IPC), is enough to let even systems based on the kernel hotplug mechanism gain a speed improvement (and expectedly less memory usage on system startup). On modern multi-core machines this will also allow the split operations to run on different CPU cores, at no extra cost. Synchronized operation - where your solution still holds the possibility of race conditions from possible parallelism. I suggested wrapping #2 in an ifdef for the people who don't have netlink at all, such as on BSD, and also anyone who doesn't want the extra bytes.
Therefore my approach allows for opt-out in the config, but this otherwise brings me to the idea of throwing a compiler error when netlink support is built on a system where it is not available, or optionally a warning and auto-disabling of netlink support (usually a 4-line snippet at the start of the code):

#if CONFIG_FEATURE_MDEV_NETLINK && NETLINK_NOT_AVAILABLE
#undef CONFIG_FEATURE_MDEV_NETLINK
#define CONFIG_FEATURE_MDEV_NETLINK 0
#warning This system lacks netlink support, netlink disabled
#endif

... or something similar. -- Harald
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 14:20, Didier Kryn wrote: There are interesting technical points in this discussion, but it turns out to be mostly about philosophy and frustration. ACK :( Hotplug is KISS; it is stupid, maybe, but it is so simple that you can probably do the job with a script. The same serialization you propose to implement in user space by means of several processes, a named pipe and still the fork bomb, has been implemented in the kernel without the fork bomb: it is called netlink. You mixed some things up, maybe due to my poor English: - current mdev suffers from parallel re-parsing of the conf for every event - for those who like to stay with the kernel hotplug mechanism, my approach gives some benefits, but will not solve every corner case; but it looks like I could extend the approach somewhat to make serialization easier (this needs some more checking). - for those who want to use netlink, it is a small long-lived netlink reader, pushing the event messages forward to the central back end (which frees resources when idle). That shall work as a netlink solution should. So where is your concern? Using a pipe for communication from one process to another? This is Unix IPC / multi-threading. Nothing else. These people you are talking of, who would like to see hotplug serialized but do not want netlink, do they really exist? This set of people is most likely the empty set. In case they really exist, then they must be idiots, and then, well, should Busybox support idiocy? As soon as you can prove that this set of users is empty or holds only a negligible minority, we can set the default config for the kernel hotplug mechanism to off, so it will be excluded from pre-built binaries. When nobody complains any more - that's it, you get a netlink solution. I agree it's fun to have all tools in one static binary. But I don't see any serious reason to make it an absolute condition. You speak of *preference*, but this very one looks pretty futile.
I don't see the problem with having even a dozen applications, all static; why not, I'm also a fan of static linking. I explained it already in the other thread to Laurent. It is my way. I try to avoid forcing others to do things in a specific way, but I hate being forced by others. Busybox is a public tool set and shall provide the tools which allow the user / admin to set up the system as he likes. My approach is to let others use the kernel hotplug mechanism, if they like, but still gain a performance boost, while users who like to use netlink get a netlink solution. The cost is some unused bytes in the pre-built binaries (which may be opted out in the build config). So where do I fail? Neither the optional event gathering parts (which will try to stay as fast / small as possible), nor the parser / device operation handler, works differently than before (except some code reordering to avoid parsing the conf file for each event). The job mdev does has just been split up into different threads, using a proven inter-process communication (IPC) technique. Again, where do I fail? -- Harald
Re: RFD: Rework/extending functionality of mdev
On 13.03.2015 15:46, Michael Conrad wrote: I thought pseudocode would be clearer than English text, but I suppose my pseudocode is still really just English... Maybe some comments will help. You can't fix the shortcomings of your code with some comments ... besides that, it looks like you dropped an #ifdef. The new code would not be run like a hotplug helper; it would be run as a daemon, probably from a supervisor. But the old code is still there and can still be run as a hotplug helper. The new code behaves exactly like the old code. When used as a hotplug helper, it suffers from parsing the conf for each event. My approach is a split of the old mdev into two active threads, which avoids those problems even for those who like to stay with kernel hotplug. ... and those who like to use netlink can choose netlink and get netlink, using the same back end as the kernel hotplug. So where am I wrong? What is the reason for your concern? Using a pipe as IPC? That fifo supervisor? What does my approach not do that you need (except staying completely with the old, suffering code)? -- Harald
Re: RFD: Rework/extending functionality of mdev
Interrupts ... no trouble, everybody agrees you need only one unblocked interrupt source, but never ask for the detail of which one ... :) Hi Laurent ! I'm sorry if I came across as dismissive or strongly opposed to the idea. It was not my intent. My intent, as always, is to try and make sure that potential new code 1. is worth writing at all, and 2. is designed the right way. I don't object to discourage you, I object to generate discussion. Which has been working so far. ;) ACK I understand your point. If I don't modify my mdev.conf, everything will still work the same and I can script the functionality if I prefer; but if you prefer centralizing the mounts and other stuff, it will also be possible to do it in mdev.conf. This is my primary intention. It is very reasonable. The only questions are: - what are the costs of adding your functionality to mdev.conf ? - are the benefits worth the costs ? I don't know the exact costs ahead, but they should not be massive. Look, we need a mkdir and symlink plus setting owner/permissions, and we need to set up an arg vector to call mount. The rest will be some rework of the parser, but I don't count this as cost. So the extended syntax shouldn't add much. OK, maybe an include option in mdev.conf has some extra cost ... but the benefit would be for all who like to split mdev.conf into separate files. A BB config option could allow include in mdev.conf. Some more cost will result from the possibility of using netlink, but the benefit will be not parsing the table for every event. So some cost is acceptable for me. Either part, hotplug handler or netlink reader, may be excluded in the BB config. I see no trouble grouping the code and excluding it when the option is deselected. Minor changes may persist, but shall not blow out the principles Busybox is based on, else I have a big trash can ...
As a preamble, let me just say that if you manage to make your syntax extensions a compile-time option and I can deactivate them to get the same mdev binary as I have today, then I have no objection, apart from the fact that I don't think it's good design - see below. Deactivating unwanted stuff in BB config is my intention. Deactivation will not result in same binary, due to some intended parser changes, but cost for this shouldn't be much notable ... ... and yes, I know to be picky on size restricted development. I started my programming practice on an 8008 with 512 Byte ROM and 128 Byte RAM ... in addition to a Zuse Z31 computer (one of the first computer models using transistors) with 2000 words magnetic core memory of each 11 decimal digits ... :) My point, which I didn't make clear in my previous message because I was too busy poking fun at you (my apologies, banter should never override clarity), is that I find it bad design to bloat the parser for a hotplug helper configuration file in order to add functionality for a /dev initialization program, that has *nothing to do* with hotplug helper configuration. Ok, here we agree. I would highly deprecate blowing up the hotplug scanning, but here we come to the reason I stumbled about and asked if we can avoid this extra parsing all together, speeding up all. So I was back at netlink. The confusion between the two comes from two places: - the applet name, of course; I find it unfortunate that mdev and mdev -s share the same name, and I'd rather have them called mdev and mdev-init, for instance. ACK, this could be done, for all the functionality ... but this will be a philosophical discussion. As mdev the hotplug helper does not use the code of mdev -s, there is no time cost to have that in one binary. 
- the fact that mdev's activity as a hotplug helper is about creating nodes, fixing permissions, and generally doing stuff around /dev, which is, as you say, logically close to what you need to do at initialization time, so at first sight the configuration file looks like a tempting place to put that initialization-time work. But I maintain that mdev.conf should be for mdev, not for mdev -s. mdev -s is just doing the equivalent of calling mdev for every device that has been created since boot. If you make a special section in mdev.conf for mdev -s, this 1. blurs the lines even more, which is not desirable, and 2. bloats the parser for mdev with things it does not need after the first mdev -s invocation. To 1. - ACK, see below. To 2. - as I intend to avoid parsing on every event, that is, I would like to parse all rules into a memory table and then scan only the memory table for each arriving event, the extra code in the parser does not add to your concern. Besides this, I tried to choose the syntax carefully, to produce not much overhead. We have two checks: one on the last char of the regex (slash means directory, at-sign means symlink - ignored on hotplug), and the second on the percent sign of the mount file system type.
Re: RFD: Rework/extending functionality of mdev
On Wed, 11 Mar 2015 14:30:13 +0100 Harald Becker ra...@gmx.de wrote: Hi Natanael ! Hi Isaac ! Looks like you misunderstand my approach for the mdev changes ... ok, maybe I explained it the wrong way, so let's go one step back and start with a wider view: IMO Busybox is a set of tools which allows setting up a working system environment. How this setup is done is up to the distro / system maintainer; that is, it is up to their preferences. I really like the idea of an optimized version using netlink for the hotplug events, avoiding unnecessary forking on each event, but there are other people who dislike that approach and prefer to use the hotplug handler method, or even ignore the hotplug feature completely and set up device nodes only when required (semi-automatic: not manual mknod, but manual invocation of mdev). The world is not uniform, where all people share the same preferences, so we need to be polite, accept those different kinds of preferences, and not try to force someone to set up their system in a specific way. Right? ... else we would be at the end of the discussion and the end of my project approach :( ... but I think you will agree: As Busybox is the used tool set, it shall provide the tools for all users, and shall not try to force those users to use netlink, etc. So a netlink listener should not be implemented in mdev. I agree on that. ... My idea is a fifo approach. FIFO = first-in-first-out I assume that you are talking about named pipes (aka fifos) http://en.wikipedia.org/wiki/Named_pipe This allows splitting the device management functionalities. Whichever approach to gathering the device information is used, the parser and device handling part can be shared (even in a mixed usage scenario).
So we have the following functional blocks for our device management: - initial setup of the device file system environment (yes, can be done by shell scripting, but it is a functional block) - starting the fifo management and automatic parser invocation (long-lived minimalistic daemon) - manual scanning of the sys file system and gathering device info - setting up the usage of the hotplug helper system (check fifo availability and set the hotplug helper in the kernel) - a hotplug helper spawned by the kernel on every event (should be as small / fast as possible) - a netlink-based event receiver (long-lived daemon, small memory footprint) Why do you need a hotplug helper spawned by the kernel when you have a netlink listener? The entire idea of the netlink listener is to avoid the kernel-spawned hotplug helper. It simply does not make sense to have both. - the device node handling part (conf table parser / calling the required operation) Here the gathering parts may be used according to the user's preferences (and may be opted out in the BB configuration). Every gathering part grabs the required information, sanitizes, serializes and then writes some kind of command to the fifo. The fifo management (a minimalist daemon) starts a new parser process when there is none running and watches its operation (failure handling). If you are talking about named pipes (created by mkfifo) then your fifo approach will break here. Once the writing part of the fifo (eg a gathering part) is done writing and closes the writing end, the fifo is consumed and needs to be re-created. No other gathering part will be able to write anything to the fifo. Basically what you describe as a fifo manager sounds more like a bus, like dbus. I think you are on the wrong track. Sorry. -nc
Re: RFD: Rework/extending functionality of mdev
On Wed, Mar 11, 2015 at 7:02 PM, Harald Becker ra...@gmx.de wrote: On 11.03.2015 16:21, Laurent Bercot wrote: I don't understand that binary choice... You can work on your own project without forking Busybox. You can use both Busybox and your own project on your systems. Busybox is a set of tools, why should it be THE set of tools ? Sure, I know how to do this, I started creating adapted Busybox versions to specific needs for a minimalistic 386SX Board. Around 1995, or so ... wow, long time now :) It is neither a knowledge nor any technical problem, it is preference: I want to have *one* statical binary in the minimal system, and being able to run a full system setup with this (system) binary (or call it tool set). All I need then is the binary, some configs, some scripts (and may be special applications). I even go so far, to run a system with exactly that one binary only, all other applications functions are done with scripting (ash, sed, awk). Sure those are minimalist (dedicated systems), but they may be used in a comfortable manner. I even started a project to create a file system browser (comparable to Midnight Commander, with no two pane mode but up to 9 quick switch directories), using only BB and scripting. All packed in a (possibly self extracting) single shell script. The only requirements to run this, is (should be) a working BB (defconfig) environment, usual proc, sys, dev, setup and a writable /tmp directory (e.g. on tmpfs). The work for this was half way through to first public alpha, then Denys reaction on a slight change request was so frustrating that I pushed the project into an otherwise unused archive corner, and stopped any further development. I'm not sure how heavily mdev [-s] relies on libbb and how hard it would be to extract the source and make it into a standalone, easily hackable project, but if you want to test architecture ideas, that's the way to go - copy stuff and tinker with it until you have something satisfying. 
I always did it this way, and never posted untested stuff, except some snippets when someone asked for something and I quickly hacked up an answer (marked as untested). ... if not, you still have your harald-mdev, and you can still use it along with Busybox - you'll have two binaries instead of one, but even on whatever tiny noMMU system you work on, you'll be fine. Sure, I could have two, three, four, ten, twenty, a hundred ... programs, but my preference is to have *one* statically linked binary for the complete system tool set (on minimal systems). So why don't you write such a binary wrapping Busybox and other things, then? I think the KISS principle still ought to be alive these days. Not to mention, Denys already cannot cope with the maintenance the way that would be ideal. For instance, some of my IMHO serious bugfixes remain uncommented. Putting more and more stuff into Busybox would just make the situation worse and more frustrating, sorry. I really do not want Busybox to follow the systemd way. On the positive side of systemd, they at least have far more resources than what Denys can offer, at least in my opinion, so ... ... That's the reason why I dislike and don't use your system approach :( ... otherwise great work :) That does not preclude design discussions, which I like, and which can happen here (unless moderators object), and people like Isaac, me, or obviously Denys, wouldn't be so defensive - because it's now about your project, not Busybox; you have the final say, and it's totally fine, because I don't have to use harald-mdev if I don't want to. One of the things I really hate is being forced to do something (especially in a specific way), only topped by someone else forcing me to do it in a specific way :( ... ... so I always try to do modifications in a way that lets others decide about usage, expecting not to break existing setups (at least without asking ahead if welcome).
Slight modifications may be unavoidable (e.g. a different parameter notation), but they shall not require complete changes to the system setup. -- Harald ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
Hi Isaac ! On 12.03.2015 02:05, Isaac Dunham wrote: I just don't think you're quite thinking through exactly what this means. In which sense? Let me know what I got wrong. I'm a human making mistakes, not a perfect machine. It seems like you want to force everyone who uses mdev to use a multi-part setup. Whoops, you got that definitely wrong! Who told you I want to force you to use a different setup? My clear intention was to add some extra flexibility without breaking existing usage. ... but criticism arose, which poked for more clarity. Only due to this, and the wish to fit the preferences of as many people as possible, there might result a slight change: one extra operation may be required (when not combined / hidden in the mdev system for clarity). At the location where you set up the hotplug handler or do mdev -s, you either need a slight change of the command parameters or need to insert a single extra command (details on this have not been discussed yet). But that's it. Are you really concerned about inserting a single extra line in your startup scripts, or having a modification to one of those lines? That way you would block any innovation, *and* the needs of other people. Specifically, you are proposing to *replace* the standard hotplugger design (where the kernel spawns one program that then sets up the device itself) with a new design where there's a hotplugger that submits each event to a longer-lived program, which in turn submits the events to a program that creates/sets up the devices. You say I propose a change of the device management system? The mdev system suffers on event bursts, and this has been grumbled about several times in the past. What I'm now trying is to build a solution for this into Busybox. Not reinventing the wheel, but implementing known solutions. I am saying, don't force this design on people who want the hotplug helper that the kernel spawns to finish things up.
The only way to solve this would be to leave mdev as it is and create an additional netlink-mdev, which brings us to the situation where it gets complicated (or at least complex) to share code and work between the two. Selection between them (when you don't want to include both hunks) needs to be done with BB configuration, which makes things like pre-built binaries a mess (how many different versions shall be maintained? Only yours? Only mine?). To solve this conflict, and to also give the current device management system some speed improvement on bursts, I tried to find a solution. This solution centers on the most important problem: the parallelism, and parsing the conf for each single hotplug event. So I *propose* to split the function of mdev into two separate parts. Part one will contain the kernel hotplug helper part (and should be as small and fast as possible), and part two the parsing of the conf and the device node operations ... with the requirement of communication between any number of running part ones and the single part two. To reduce the cost of those part-one hunks (remember, they shall be as fast as possible), and to provide failure management (which BB currently completely lacks), the communication between the parts needs to be watched by a small helper daemon. A very minimalist daemon, which doesn't touch any data; it just fires up the device operation stuff when required and waits until that process dies, then goes back to waiting for more events to arrive (this is not like udev). Do I wish to overcome the suffering of Busybox device management? Yes. Does this need some changes? Yes, I propose a one- or two-line change in your startup files, with the benefit of a speed improvement. Do I otherwise propose to change your system setup or flip to using netlink operation? *NO* Hence, I do not force anybody to change from using the kernel hotplug feature. I improve that mechanism ...
with the ability to add further mechanisms without duplicating code. One shall be a netlink reader in BB; the other may be external programs of others, with a better interface to use device management functions from there. Are you really complaining against any kind of innovation? Any step to overcome long-standing problems, discussed several times ... but dropped, mostly due to the amount of required work? Do I *propose* some innovation? Yes. Do I propose to force someone to change to a different mechanism? No. Agreed. But I would include hotplug daemons under etc. I used etc., so add any similar type of program you like, including hotplug daemons ... but stop, what are hotplug daemons? Do you mean daemons like udev? udev uses netlink reading. Otherwise I know about hotplug helper programs (current mdev operation), but not about hotplug daemons. That's the best description I can come up with for the FIFO reader that you proposed, which would read from a fifo and write to a pipe, monitoring the state of mdev -p Wow? You got it wrong!
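A minimal sketch of the fifo-watcher design described in this thread, under stated assumptions: the fifo path and the "handler" function are hypothetical stand-ins (here the handler just counts events), not an actual Busybox interface. The point it illustrates is that one parser process is spawned per *burst* of events, not per event:

```shell
#!/bin/sh
# Sketch of the proposed fifo-watcher idea. "handler" stands in for the
# conf-parsing back end; here it merely counts events for demonstration.
FIFO="${FIFO:-$(mktemp -u)}"
mkfifo "$FIFO"

handler() {
    # One parser instance consumes a whole burst of events, then exits
    # when every writer has closed its end of the fifo.
    n=0
    while read -r event; do n=$((n + 1)); done
    echo "$n"
}

# One round of the watch loop: block until writers appear, run one handler.
watch_once() {
    handler < "$FIFO"
}

# Simulate two kernel-spawned hotplug helpers writing one event each:
{ echo "add /class/block/sda"; echo "add /class/block/sdb"; } > "$FIFO" &
events=$(watch_once)
wait
rm -f "$FIFO"
echo "handled $events events with one parser process"
```

A real watcher would wrap `watch_once` in an endless loop and spawn a failure script when the handler dies abnormally, as described above.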
Re: RFD: Rework/extending functionality of mdev
Hi ! To Michael: Don't be confused; Natanael provided an alternative version to achieve the initial device file system setup (which isn't bad, but may have its cons for some people on small resource-constrained systems in the embedded world). So I left it out for clarity ... but it still may be implemented / used as an alternative method for setup. To Natanael: On 12.03.2015 10:14, Natanael Copa wrote: - The third method (scanning the sys file system) is the operation of mdev -s (initial population of the device file system; don't mix it up with mdev without parameters); in addition, some users take this as semi-automatic device management (embedded world) I disagree here. mdev -s solves a different problem. No. There are 2 different problems. 1) Handle hotplug events. There are only 2 methods for this problem: A) using 1 fork per event (eg /sbin/mdev in /proc/sys/kernel/hotplug). This is simple but slow. This is mdev (without any parameters). B) reading hotplug events via netlink using a long-lived daemon (like udev does). Currently not in mdev, but it shall be an alternative mechanism in my implementation, so you may choose and use the mechanism you like. 2) Do cold plugging. At boot you need to configure the devices the kernel already knows about. This is the operation of mdev -s. B) Solve problem 1 mentioned above (set up a hotplug handler) and then trigger the hotplug events. Now you don't need to scan sysfs at all. But still there are people in the embedded world who like (or are forced) to use semi-automatic device handling, that is, calling something like mdev -s to scan the sys file system for new devices. My approach is to give them all the possibility to do device management with the mechanism they like, without maintaining different device management systems and duplicating the code. Three different mechanisms, with three different front ends, and one shared back end. ... the only thing I see is a difference in initial device system population.
You provided a different method to trigger the coldplug events (and yes, I understand that approach), but that one will only work when you use either the kernel hotplug helper mechanism or the netlink approach. You drop those who can't / don't want to use either. I left that out in the short summary for Michael, but didn't forget your hint. Maybe we can have an additional alternative for the setup part, implementing event triggering (still doing the sys file system scan, but with different handling then) ... or you do it in scripting: set up your hotplug handler or netlink listener and then run a script to trigger the plug events (nice idea otherwise, I like it, but I won't unnecessarily drop people on the other end). What I currently do is: - for problem 1 (dealing with hotplug events) I use method A. - for problem 2 (set up devices at boot) I use method A because my hotplug handler is slow due to many forks. What I would like to do is switch to method B for both problem 1 and problem 2. However, I want the long-lived daemon to consume less memory, and I want to be able to use the mdev.conf syntax for configuring the devices. No problem; my approach shall give you the possibility to do it the way you like, without blocking the others. Still you need to do the following steps on system startup: - initial creation of the device file system (out of scope of mdev) - prepare your system for hotplug events, either the kernel method or netlink (if you go that way) - trigger initial device file system population (cold plug) (may be done in two ways, yours or the old mdev -s) - So how can we have all three methods without duplication of code, plus how can we speed up the kernel hotplug method? IMHO, there are not 3 methods, so the rest of the discussion is useless. You are going to force others to do it the way you like, and everything else is of no interest? I'm trying to give most people the possibility to do it the way they like, your way included! ...
without blocking other methods (or maybe external implementations reusing the conf / handling back end). ... as with other functions in BB, unwanted functionalities may be opted out in the build config. ... so this would mean for you: include the back end and the netlink into your Busybox, do the usual device file system creation, activate the netlink handler (which shall auto-start the fifo watcher), then trigger the cold plug events (undecided whether done with a script or added to the binary). Anything wrong with this? -- Harald ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
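The two startup steps argued over above (register a hotplug handler, then trigger coldplug by replaying uevents) can be sketched in shell. This is a sketch under stated assumptions: it runs against a tiny stand-in directory so it is testable anywhere; on a real system SYS would be /sys and the writes would make the kernel re-emit the events:

```shell
#!/bin/sh
# Sketch of method-A setup plus uevent-replay coldplug, demonstrated
# against a fake sysfs tree (SYS) instead of the real /sys.
SYS="${SYS:-$(mktemp -d)}"

# Build a tiny stand-in for /sys with two devices exposing uevent files:
mkdir -p "$SYS/class/block/sda" "$SYS/class/block/sdb"
: > "$SYS/class/block/sda/uevent"
: > "$SYS/class/block/sdb/uevent"

# Step 1 (method A): register the kernel hotplug helper. On a real system:
#   echo /sbin/mdev > /proc/sys/kernel/hotplug

# Step 2: coldplug. Writing "add" to every uevent file asks the kernel to
# replay the matching hotplug event through whatever was set up in step 1
# (or through a netlink listener, method B). No sysfs device scan needed.
find "$SYS" -name uevent | while read -r f; do
    echo add > "$f" 2>/dev/null
done

triggered=$(grep -l add "$SYS"/class/block/*/uevent | wc -l)
echo "replayed $triggered coldplug events"
```

On the fake tree the "add" simply lands in the files, which is enough to show the traversal; the real kernel consumes the write and emits an event instead.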
Re: RFD: Rework/extending functionality of mdev
On Thu, Mar 12, 2015 at 6:00 PM, Harald Becker ra...@gmx.de wrote: Hi Laszlo ! So why don't you write such a binary wrapping Busybox and other things, then? I think the KISS principle still ought to be alive these days. Not to mention, Denys already cannot cope with the maintenance the way that would be ideal. For instance, some of my IMHO serious bugfixes are uncommented. Putting more and more stuff into Busybox would just make the situation worse and more frustrating, The usual way of development is: (1) planning the work to do (the step we are discussing here) (2) code hacking (what I will do next) (3) preliminary testing (also my job) (4) offering access to those who like (for further testing) (5) fixing complaints (6) putting it into the mainstream (or making it accessible to the rest). Right? So what is your complaint? sorry. I really do not want Busybox to follow the systemd way. Who told you I'm trying to go that way? My intention is to overcome the mdev problems, and allow those who like to use the netlink interface. I dislike encoding any fixed functionality in a binary, and I don't force anybody to use possible extensions. Laurent poked me toward more clarity, which would mean splitting early initialization from mdev operation, with the caveat of a slight change in init scripts (maybe one more command or some extra command parameter; it could be done automatically, but that means different functionalities in the applet - maybe more discussion is required on that). Beside that, it shall be up to the system maintainer to choose which device management mechanism to use (BB shall provide the tools for that: small modular tools, bound together by the admin - no big monolith). On the positive side of systemd, they at least have far more resources than what Denys can offer, at least in my opinion, so ... You are talking about development resources? Here they are! I'm willing to do that job, not asking for someone else to do the work.
It is not only about feature development resources, but also maintenance. Denys will be the maintainer for new busybox code as far as I am aware. I have not seen this model change for a very long while, sadly. Once, I tried to ask for changing that model, but I was apparently just shooting myself in the foot, based on the feedback. It is nice that you are trying to help and I certainly appreciate it, but why can't you simply do that job nicely outside busybox, where *you* have to be responsible for that project? It would be an explicit way of enforcing KISS and not putting more burden on Denys. He may enjoy maintaining more and more code, but in all honesty, with due respect and appreciation, I do not enjoy it when developers do not get responses to their patches and all depend only on Denys with regard to upstreaming. It is also a bit demotivating that we do not get early feedback and then realize that our patches are completely reworked by Denys. It is unfortunate that our work almost goes to /dev/null. I do not feel appreciated. This is all happening because of more and more complex stuff being maintained by one person. It is not a personal offense against Denys, to be fair. It is a model I very much disagree with in general. If you can convince the busybox community to split up the maintainership, perhaps that would be a completely different discussion to start with, but in all honesty, I do not like these monolithic projects. I still stand by KISS being a good thing. If I could, I would personally replace busybox with little custom tools on my system, but I currently do not have the resources for that. Therefore, all the complexity and non-KISS that goes in is something I need to accept. I'm asking about which preferences other people have, so I'm able to make the right decisions before I start hacking code ...
Asking for feedback is good, nothing wrong there; putting this into busybox this way is, on the other hand, wrong IMHO. -- Harald ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
On Thu, Mar 12, 2015 at 04:04:41PM +0100, Harald Becker wrote: Hi Isaac ! On 12.03.2015 02:05, Isaac Dunham wrote: I just don't think you're quite thinking through exactly what this means. In which sense? Let me know what I got wrong. I'm a human making mistakes, not a perfect machine. It seems like you want to force everyone who uses mdev to use a multi-part setup. Whoops, you got that definitely wrong! Who told you I want to force you to use a different setup? My clear intention was to add some extra flexibility without breaking existing usage. ... but criticism arose, which poked for more clarity. Only due to this, and the wish to fit the preferences of as many people as possible, there might result a slight change: one extra operation may be required (when not combined / hidden in the mdev system for clarity). At the location where you set up the hotplug handler or do mdev -s, you either need a slight change of the command parameters or need to insert a single extra command (details on this have not been discussed yet). But that's it. Are you really concerned about inserting a single extra line in your startup scripts, or having a modification to one of those lines? That way you would block any innovation, *and* the needs of other people. No, you misunderstand. Read my proposal below and tell me why this won't do what you're after, OTHER than "the way mdev works now is broken/wrong", since that *isn't* universally accepted. As you stipulated about your design, applet and option names can be changed easily; but when I say "new applet", I mean to indicate that this should be separate from mdev.
* mdev (no options) ~ as it works now (ie, a hotplugger that parses mdev.conf itself)
* mdev -s (scan sysfs) ~ as it works now, or could feed the mdev parser
* mdev -i/-p (read events from stdin) ~ the mdev parser, accepting a stream roughly equivalent to a series of uevent files with a separator following each event[1].
To make it read from a named pipe/fifo, the tool that starts it could use dup2().
* new applet: nld ~ netlink daemon that feeds the mdev parser.
* new applet: fifohp ~ your hotplug helper, fifo watch daemon that spawns a parser, and hotplug setup tool.
I had actually thought that it might work at least as well if, rather than starting a daemon at init, the fifo hotplugger checks whether there's a fifo and *becomes* the fifo watch daemon if needed. Also, I was thinking in terms of writing to a pipe because that lets us make sure events get delivered in full (ie, what happens if mdev dies halfway through reading an event?). This way,
+ mdev is the only applet parsing mdev.conf;
+ all approaches to running mdev are possible;
+ it's easy to switch from mdev to the new hotplugger, while still having mdev available if the new hotplugger breaks;
+ mdev is only responsible for tasks that involve parsing mdev.conf.
And people who want the change don't have to do more than your proposal would require. [1] The format proposed by Laurent uses \0 as a line terminator; I think it might be better to use something that's more readily generated by standard scripting tools from uevent files, which would make it possible to use cat or env to feed the mdev parser. Thanks, Isaac Dunham ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
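A sketch of what footnote [1] gestures at: an event stream that plain scripting tools can generate from sysfs uevent files. The blank-line separator, the sample uevent contents, and the `emit_events` name are assumptions for illustration; the actual separator was left undecided in this thread:

```shell
#!/bin/sh
# Sketch: building a parser-ready event stream from uevent files with
# nothing but cat and echo, per the footnote above. Uses a fake sysfs
# tree so it can run anywhere; on a real system SYS would be /sys.
SYS="${SYS:-$(mktemp -d)}"
mkdir -p "$SYS/dev"
printf 'ACTION=add\nDEVNAME=sda\nSUBSYSTEM=block\n' > "$SYS/dev/uevent"

# cat plus one echo per file yields newline-terminated KEY=VAL records,
# with a blank line closing each event (assumed separator):
emit_events() {
    for u in "$SYS"/*/uevent; do
        cat "$u"
        echo
    done
}

# A parser stub: awk's paragraph mode splits the stream on blank lines,
# so NR at the end equals the number of events in the stream.
count=$(emit_events | awk 'BEGIN{RS=""} END{print NR}')
echo "stream carries $count event(s)"
```

The appeal of a format like this is exactly the footnote's point: cat, env, and awk can all produce or consume it without a binary-safe layer.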
Re: RFD: Rework/extending functionality of mdev
On 12/03/2015 18:26, Isaac Dunham wrote: [1] The format proposed by Laurent uses \0 as a line terminator; I think it might be better to use something that's more readily generated by standard scripting tools from uevent files, which would make it possible to use cat or env to feed the mdev parser. An uevent sent over netlink is already a series of null-terminated strings. If you want to make text-scripting tools able to process those messages, then you need to make the netlink listener convert the events. The advantage of \0 terminators is that they can't appear anywhere else in the strings. Changing the terminators requires either a quoting / parsing layer, which is hard and expensive, or making assumptions about the format of the messages. It would be feasible, for now, to assume that \n does not appear in uevent strings, and to replace all instances of \0 (including my use of an extra \0 as an event terminator) with \n. But there's no guarantee that \n won't appear in a message in the future, and I'd rather avoid introducing constraints that don't need to be introduced. That's my rationale for sticking with \0. -- Laurent ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
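To make Laurent's point concrete, here is a sketch of the conversion layer he describes: a netlink uevent payload is a run of \0-terminated KEY=VAL strings, and tr can turn it into text - valid only under the exact assumption he flags, namely that no value ever contains \n. The simulated payload and the `payload` function name are illustrative, not real listener output:

```shell
#!/bin/sh
# Simulate a netlink uevent payload: header string plus KEY=VAL strings,
# each terminated by \0 (as the kernel sends them).
payload() {
    printf 'add@/class/block/sda\0ACTION=add\0DEVNAME=sda\0'
}

# The \0 -> \n conversion layer for text tools. Safe only if values are
# guaranteed newline-free - the assumption Laurent prefers not to make.
as_text=$(payload | tr '\0' '\n')

# Now ordinary line-oriented tools work on it:
count=$(printf '%s\n' "$as_text" | grep -c '=')
echo "parsed $count KEY=VAL pairs"
```

Keeping \0 end to end avoids this layer entirely, which is the rationale given above.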
Re: RFD: Rework/extending functionality of mdev
Hi Natanael ! I assume that you are talking about named pipes (aka fifos) http://en.wikipedia.org/wiki/Named_pipe Ack, a fifo device in the Linux / Unix world. Why do you need a hotplug helper spawned by the kernel when you have a netlink listener? The entire idea of the netlink listener is to avoid the kernel-spawned hotplug helper. ... because there are people who dislike using netlink and want to use the kernel hotplug helper mechanism. That's it. People's preferences are different. Opt out the functions you dislike in the BB config. ... but this is vice versa for those who choose to use the kernel hotplug mechanism. It simply does not make sense to have both. Both active at the same time? Sure! ... This is not the intention. I've been talking about the functionalities which need to be implemented. Every gathering part grabs the required information, sanitizes and serializes it, and then writes some kind of command to the fifo. The fifo management (a minimalist daemon) starts a new parser process when there is none running and watches its operation (failure handling). If you are talking about named pipes (created by mkfifo) then your fifo approach will break here. Once the writing part of the fifo (eg a gathering part) is done writing and closes the writing end, the fifo is consumed and needs to be re-created. No other gathering part will be able to write anything to the fifo. ??? You don't understand the operation of named pipes! Any program (with appropriate rights) may open the named pipe (fifo) for writing. As long as the data for one event is written in one big chunk, it won't interfere with possible parallel writers. If the fifo device is closed by a writer, this does not mean it vanishes; just reopen it for writing and write the next event chunk. Basically what you describe as a fifo manager sounds more like a bus, like dbus. It is the old and proven inter-process communication mechanism of Unix, nothing new.
And "fifo manager" sounds really big; it is a really minimalistic daemon with primitive operation (it never touches the data in the fifo, nor even reads that data). Its main purpose, besides creating the fifo, is to fire up a conf parser / device operation process when required, and to react to its failure (spawn a script with args). -- Harald ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
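Harald's correction above - a named pipe is not "consumed" when a writer closes it - is easy to demonstrate. A small sketch (paths are temporary stand-ins) where two independent writers open the same fifo in turn, with no re-creation between them:

```shell
#!/bin/sh
# Demonstrates fifo reopen semantics: closing the write end does not
# destroy the fifo; any later writer may open it again.
d=$(mktemp -d)
mkfifo "$d/events"

# First writer opens, writes one chunk, closes its end:
echo "event-1" > "$d/events" &
first=$(head -n 1 "$d/events")
wait

# The very same fifo node accepts a second, independent writer:
echo "event-2" > "$d/events" &
second=$(head -n 1 "$d/events")
wait

rm -r "$d"
echo "$first $second"
```

The related guarantee Harald leans on for parallel writers - that a chunk written in one write() is not interleaved with other writers' data - holds for writes up to PIPE_BUF bytes (at least 512 per POSIX, 4096 on Linux).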
Re: RFD: Rework/extending functionality of mdev
The question I was asking was only about this: On 3/12/2015 12:04 PM, Harald Becker wrote: but that one will only work when you either use the kernel hotplug helper mechanism, or the netlink approach. You drop out those who can't / doesn't want to use either. ...which I really do think could be answered in one paragraph :-) If the netlink socket is the right way to solve the forkbomb problem that happens with hotplug helpers, then why would anyone want to solve it the wrong way? I don't understand the need. -Mike ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
Hi Laurent ! Out of curiosity: what are, to you, the benefits of this approach ? What are the benefits of preferences? ... good question!? ;) Does it actually save you noticeable amounts of RAM ? Maybe a few bytes ... noticeable? What is your threshold for noticeable here? ... otherwise I would say *NO* of disk space ? Which disk space? ... which disk? ... looking around ... not seeing a disk (in your sense) ... shot: complete system from initramfs ... a disk in this sense is some external USB data storage (only data, never used for system purposes). Is it about maintenance - just copy one binary file from a system to another ? Hit! (But you'd also have to copy all your scripts...) Second file, a tar, but all architecture-independent; third file, a tar, with pre-configured setups. Is it about something else ? Yes, the most important! ... It's my way, the way I did it for several commercial projects ... and the way I like to do it ... clarity / simplicity / purism ... preference! :) If it's just for the hacking value, I can totally respect that, but it's not an argument you can use to justify architectural modifications to a Unix tool other people are using, because it kinda goes against the Unix philosophy: one job, one tool. Busybox gets away with it ... Better to call Busybox a tool set; it is several commands and a library linked together to share some code size. Beside the invoking logic of the applets (multicall), the applets have to be considered separate commands ... though some commands tend to forget that. I won't try to hide fixed system functionality in a binary (better say program or command here), except for fallback operation (e.g. last-resort handling). My usual approach is to spawn a script when it's time to handle system-dependent things (or things that need to be under admin control). ... but I like to describe in configuration what to do, not how to do it (as is done in scripts).
So I like to have simple lists describing my system, let a one-shot command parse that list, and call the required programs / commands / scripts with the configured information from the lists, to do the job. e.g.

# required virtual file systems
/proc     root:root 0755 %proc
/sys      root:root 0755 %sysfs
/dev      root:root 0755 %tmpfs size=64k
/dev/pts  root:root 0755 %devpts

(This describes my system setup - a selected part - without describing how to get there.) ... and yes, that could be done with shell scripting ... the way I've been doing it for years ... but still things tend to be scattered around, so I'd like to put setting up the virtual file systems (excluding /tmp, which I set up in fstab) and preparing the device file system (including the device descriptions) in one central place (that was mdev.conf). Currently I put those lines in comments and filter them out into a shell script, but this is sometimes confusing. (and I believe that inclusion of supervisors and super-servers is already too much). ACK ... but what do you think about e.g. tcpsvd (accepting incoming tcp connections), or a netlink reader? My fifo watcher does a comparable job to tcpsvd (which brings me to the idea of creating it as a fifod applet in BB with appropriate options). We could use TCP connections, but the system cost of a fifo / named pipe should be below the cost of running through the network stack. Even on a noMMU system, I think what you'd gain from having one single userspace binary (instead of multiple small binaries, as I do on my systems) is negligible when you're running a Linux kernel in the first place, which needs at least several megabytes even when optimized to the fullest. A Linux kernel including the cpio to set up initramfs is around 6 to 8 MByte on modern kernel versions (complete system: kernel + system tools + application scripts) ... running on a system with 64 MByte or even 16 to 20 MB. No disks at all. Boot from the CD-ROM drive, then turn it off. ... but nowadays more likely boot from a USB stick :) ...
a 256 MByte stick :) ... vintage ... 32 MB boot partition, boot loader files + boot images + system config; the rest of the stick is data storage. I know a guy who manages to run almost-POSIX systems in crazy tiny amounts of RAM - think 256 kB, and the TCP/IP stack takes about 40 kB - but that's a whole other world, with microcontrollers, JTAG, a specific OS in asm, and ninja optimization techniques at the expense of maintainability and practicality. Full-duplex serial bridging between a PLC bus system and a special synchronous clocked bus system for hazardous areas, with a bit rate of 31 kbps and manual Manchester code detection and bit shifting, on an 8-bit CPU at 8 to 12 MHz with 16 kB EPROM and 1 kByte RAM, and a frame size of up to 300 bytes in each direction ... and at least one bus side required sending the packet checksum in the header :( ... microcontroller ... and OS? What's that? Which OS? The first instruction executed by the CPU after reset at address X ... that was my
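The "describe what to do, not how" list earlier in this message lends itself to a one-shot parser. A sketch under stated assumptions: the line format follows Harald's example (mountpoint, owner, mode, %fstype, optional options), but the parser itself and its output are hypothetical - it prints the commands it would run instead of executing them:

```shell
#!/bin/sh
# One-shot parser sketch for a virtual-file-system list of the form:
#   mountpoint owner mode %fstype [options]
# Comment lines start with "#". Commands are printed, not executed.
parse_list() {
    awk '!/^#/ && NF >= 4 {
        dir = $1; owner = $2; mode = $3
        fstype = substr($4, 2)              # strip the leading "%"
        opts = (NF >= 5) ? " -o " $5 : ""
        printf "mkdir -p %s && chown %s %s && chmod %s %s\n", \
               dir, owner, dir, mode, dir
        printf "mount -t %s%s none %s\n", fstype, opts, dir
    }'
}

out=$(parse_list <<'EOF'
# required virtual file systems
/proc root:root 0755 %proc
/dev root:root 0755 %tmpfs size=64k
EOF
)
printf '%s\n' "$out"
```

Piping the printed commands into sh would turn the sketch into the actual one-shot setup step described above.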
[OT] long-lived spawners (was: RFD: Rework/extending functionality of mdev)
On 11/03/2015 14:02, Denys Vlasenko wrote: But that nldev process will exist for all time, right? That's not elegant. Ideally, this respawning logic should be in the kernel. Well there is already a kernel-based solution: hotplug. Sure, it's not serialized, but it's there. If you want something serialized, then you have a stream of information you need to get to the userspace - and at this point, you might as well send it as is and let userspace sort it out, and that's exactly what the netlink does. Needing daemons to answer notifications from userspace processes or the kernel is the Unix way. It's not Hurd's, it's not Plan 9's (AFAIK), but it's what we have, and it's not even that ugly. The listening and spawning logic will have to be somewhere anyway, so why not userspace ? Userspace memory is cheaper (because it can be swapped), userspace processes are safer, and processes are not a scarce resource. -- Laurent ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
I suppose it's time to dig out code from the secret archives of my secret lair again. Someone named Vladimir Dronnikov called this ndev and proposed it as a patch to Busybox in 2009. I dug it out and separated it from Busybox, and probably made some other changes I don't remember, for some nefarious agenda I again don't remember. It is a modified version of mdev that seems to do some of the things you all have been talking about. I'm not maintaining it in any way, so you all are welcome to do whatever you want with it. I have no idea whether it works or not, other than that it apparently was once useful to me. William Haddon On 03/11/2015 11:21:42 AM, Laurent Bercot wrote: On 11/03/2015 15:56, Harald Becker wrote: And one point to state clearly: I do not want to go the way to fork a project (that is the worst expected), but I'm at a point, I like / need to have Busybox to allow for some additional or modified solutions to fit my preferences I don't understand that binary choice... You can work on your own project without forking Busybox. You can use both Busybox and your own project on your systems. Busybox is a set of tools, why should it be THE set of tools ? I'm not sure how heavily mdev [-s] relies on libbb and how hard it would be to extract the source and make it into a standalone, easily hackable project, but if you want to test architecture ideas, that's the way to go - copy stuff and tinker with it until you have something satisfying. Then, if upstream wants to integrate your modifications, if you find a reasonable compromise to merge, great; if not, you still have your harald-mdev, and you can still use it along with Busybox - you'll have two binaries instead of one, but even on whatever tiny noMMU you can work on, you'll be fine. 
That does not preclude design discussions, which I like, and which can happen here (unless moderators object), and people like Isaac, me, or obviously Denys wouldn't be so defensive - because it's now about your project, not Busybox; you have the final say, and it's totally fine, because I don't have to use harald-mdev if I don't want to. Forks are bad, but alternatives are good. -- Laurent

/*
 * ndev - hardware detection daemon - based on Busybox's mdev
 *
 * Copyright 2005 Rob Landley r...@landley.net
 * Copyright 2005 Frank Sorenson fr...@tuxrocks.com
 * Copyright 2009 Vladimir Dronnikov dronni...@gmail.com
 * Copyright 2013, 2014 William Haddon will...@haddonthethird.net
 *
 * Licensed under GPL version 2; see the file COPYING for details.
 */

/* Set your tabstop to 4 */

#define _BSD_SOURCE
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <grp.h>
#include <libgen.h>
#include <limits.h>
#include <poll.h>
#include <pwd.h>
#include <regex.h>
#include <signal.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <syslog.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <linux/netlink.h>

#ifndef NDEV_CONF
#define NDEV_CONF "/etc/ndev.conf"
#endif

#define SCRATCH_SIZE 80

int scan = 0;

void emsg(char *fmt, ...)
{
	va_list val;

	va_start(val, fmt);
	vfprintf(stderr, fmt, val);
	va_end(val);
	va_start(val, fmt);
	vsyslog(LOG_ERR, fmt, val);
	va_end(val);
}

void pemsg(char *fmt, ...)
{
	va_list val;
	char *err, *nfmt;
	size_t l1, l2;

	l1 = strlen(fmt);
	err = strerror(errno);
	l2 = strlen(err);
	nfmt = malloc(l1 + l2 + 4);
	if (!nfmt)
		nfmt = fmt;
	else
		snprintf(nfmt, l1 + l2 + 4, "%s: %s\n", fmt, err);
	va_start(val, fmt);
	vfprintf(stderr, nfmt, val);
	va_end(val);
	va_start(val, fmt);
	vsyslog(LOG_ERR, nfmt, val);
	va_end(val);
	if (nfmt != fmt)
		free(nfmt);
}

char *_strchrnul(char *s, int c)
{
	char *r;

	r = strchr(s, c);
	if (!r)
		return s + strlen(s);
	return r;
}

/* Writes an entire buffer to a file descriptor. Returns count on
   success and -1 on error. */
ssize_t full_write(int fd, char *buf, size_t count)
{
	ssize_t len;
	ssize_t result;

	len = count;
	while (len) {
		result = write(fd, buf, len);
		if (result < 0 && errno != EINTR)
			return result;
		if (result > 0) {
			buf += result;
			len -= result;
		}
	}
	return count;
}

/* Reads a given number of bytes from a file descriptor. Returns the
   number of bytes read on success and -1 on error. Less than the full
   number of bytes are read only if the file ends. */
ssize_t full_read(int fd, char *buf, size_t count)
{
	ssize_t len;
	ssize_t result;

	len = count;
	while (len) {
		result = read(fd, buf, len);
		if (result < 0 && errno != EINTR)
			return result;
		if (result == 0)
			break;
		if (result > 0) {
			buf += result;
			len -= result;
		}
	}
	return count - len;
}

ssize_t copy_until_eof(int infd, int outfd)
{
	char buffer[1024];
	ssize_t result;
	ssize_t len;

	len = 0;
	while (1) {
		result = read(infd, buffer, 1024);
		if (result < 0 && errno != EINTR)
			return result;
		if
Re: RFD: Rework/extending functionality of mdev
Hi Denys !

mdev rules are complicated already. Adding more cases needs adding more code, and requires people to learn a new mini-language for mounting filesystems via mdev. Think about it. You are succumbing to featuritis. ... This is how all bloated all-in-one disasters start. Fight that urge. That's not the Unix way.

Yes! ... my failure was to mix those things without giving an explanation. Laurent pointed me to the better approach and gave the explanation: the idea is to have a one-shot point of device system initialization ... that means those device-system-related operations, and maybe deep-system-related virtual file systems (proc, sys, e.g.) ... I even exclude things like setting up a tmpfs for /tmp from this, although it could be done at no extra cost. My previous message to Natanael and Isaac shall clarify my approach and should have been the starting point of this discussion ... which is, as told, at the early phase of brainstorming about reworking the Busybox-related device management system, eliminating the long-standing problems, giving some more flexibility, and maybe adding some more functionality of benefit for those who like it (not blocking others). Though, RFD means request for discussion, not for hacking code in any way before we reach a point of agreement ... and at least some problems of mdev have already been grumbled about, as I remember ... so a discussion, giving the chance to do the work to overcome those issues ... Adding (some) extra functionality is meant to enhance flexibility (where I focused on the parts I'm most concerned with - people's preferences are different), but is only part of the work I would like to do. I tried to stay as close as possible to the current mdev.conf syntax, with the intention not to break existing setups ... but would not rule out creating a different syntax parser, if that would be the outcome of the discussion. I'm open to the results, but would like to get a solution for my preferences, too. 
It can't be that a few people dictate how the rest of the world's systems have to handle things ... the other side of what you called featuritis (I won't neglect your statement above)!

And one point to state clearly: I do not want to go the way of forking the project (that would be the worst outcome), but I'm at a point where I like / need Busybox to allow for some additional or modified solutions to fit my preferences, as others have also stated already. I'm currently willing and able to do (most of) that work, but if it's not welcome, the outcome of the discussion may also be stepping to a MyBusybox (or however it will be called). *Again*: I don't want to start a discussion about forking the project; it would be the worst possible outcome of my intention ... I would like to get a tool set based on BB's principles, but giving more flexibility to fit more people's preferences, without breaking things for others (at least the majority)! ... this means critical discussion of every relevant topic, but not blocking every new approach and functionality with the argument of size constraints or featuritis, due to personal dislikes, and then accepting a patch which adds several hundred bytes for functionality I consider pure nonsense or featuritis. I apologize for my hard words. I don't want to hurt you or anybody else, but you made several decisions in the past which resulted in immense frustration for me (and others), with the consequence of even halting development of several BB-focused projects ... please consider being more open to the discussion, based on topics, not on pure criticism or personal liking (I don't want to initiate lengthy philosophical quarrels with no practical outcome). -- Harald
Re: RFD: Rework/extending functionality of mdev
Hi William !

On 11.03.2015 17:03, William Haddon wrote: I suppose it's time to dig out code from the secret archives of my secret lair again. Someone named Vladimir Dronnikov called this ndev and proposed it as a patch to Busybox in 2009. I dug it out and separated it from Busybox, and probably made some other changes I don't remember, for some nefarious agenda I again don't remember. It is a modified version of mdev that seems to do some of the things you all have been talking about. I'm not maintaining it in any way, so you all are welcome to do whatever you want with it. I have no idea whether it works or not, other than that it apparently was once useful to me.

Maybe worth digging into, or not? ... I'm currently in the phase of collecting and discussing functionality, not concerned with code hacking. It would get much more interesting if you could say in which functionality it differs. What was the author's intention in forking mdev? And maybe, why has it been neglected? Otherwise: I saved your message and may come back to this when I start looking at concrete code, or when I'm searching for a specific functionality and how it is handled by other developers. So, thanks for the information. -- Harald
Re: RFD: Rework/extending functionality of mdev
On 3/11/2015 9:30 AM, Harald Becker wrote: So how can we avoid that unwanted parallelism, but still enable all of the above usage scenarios *and* still have a maximum of code sharing *and* a minimum of memory usage *without* delaying the average event handling too much? The gathering parts need to acquire the device information, sanitize this information, and serialize the event operations to the right order. The device node handling part shall receive the information from the gathering part(s) (whichever is used) and call the required operations, but shall avoid reparsing the conf on every event (speed up) *and* drop as much memory usage as possible, when the event system is idle. My idea is a fifo approach. This allows splitting the device management functionalities. Nevertheless which approach to gather the device information is used, the parser and device handling part can be shared (even on a mixed usage scenario).

Supposing that we have
* mdev acting as a parallel hotplug handler forked by the kernel, and then add
* mdevd, which reads netlink messages and runs as a daemon
what specifically is the appeal of a third approach which tries to re-create the kernel netlink design in user-land using a fifo written from forked hotplug helpers? I'm interested in this thread, but there is too much to read. Can you explain your reason in one concise paragraph? -Mike
Re: RFD: Rework/extending functionality of mdev
Hi Isaac !

Agreed, whole-heartedly. I just don't think you're quite thinking through exactly what this means.

In which sense? Let me know what I got wrong. I'm a human, making mistakes, not a perfect machine.

Agreed. But I would include hotplug daemons under etc.

I used etc. so add any similar type of program you like, including hotplug daemons ... but stop, what are hotplug daemons? Do you mean daemons like udev? udev uses netlink reading. Otherwise I know about hotplug helper programs (current mdev operation), but not about hotplug daemons.

The gathering parts need to acquire the device information, sanitize this information, and serialize the event operations to the right order. The device node handling part shall receive the information from the gathering part(s) (whichever is used) and call the required operations, but shall avoid reparsing the conf on every event (speed up) *and* drop as much memory usage as possible, when the event system is idle.

That *shall* is where you're forgetting about the first principle you mentioned (assuming that you mean shall in the normative sense used in RFCs).

??? Sorry, maybe it's because I'm not a native English speaker. Can you explain to me what is wrong? How should it be?

Yes, some people find the hotplugger too slow. But that doesn't mean that everyone wants a new architecture.

What do you mean by new architecture? A different system setup? Changed configuration files? My first approach was not to change the usage of current systems. Apart from a slightly bigger BB, you would not have noticed my modifications. Then came Laurent with some questions and suggestions to split off some things for clarification, so the changes may result in slightly modified applet names and/or parameter usage (still under discussion), to be able to adopt all the functionality ... but otherwise you won't need to change your setup if you do not want to. 
Some people, at some points in time, would prefer to use a plain, simple, hotplugger, regardless of whether it's slow.

??? Didn't you notice the following: - using the hotplug handler approach of the kernel (current operation of mdev)

(Personally, I'd like a faster boot, but after that a hotplugger that doesn't daemonize is fine by me.)

Do you really like forking a separate conf parser for each hotplug event, even if they tend to arrive in bursts? Wouldn't you like to get a faster system startup with mdev, without changing system setup / configuration?

So, in order to respect that preference, it would be nice if you could let the hotplug part of mdev keep doing what it does.

What do you expect the hotplug part to be? The full hotplug handler including conf file parser, spawned in parallel for each event? Wouldn't you like to benefit from a faster system startup, only because you fear there is another (minimalistic) daemon sitting in the back? Sounds like automobiles are dangerous, I won't use them? ... sorry if this sounds bad; I'm trying to understand what exactly you are fearing ... I expect you misunderstood something (or I explained / translated it wrongly).

My idea is a fifo approach. This allows splitting the device management functionalities. Nevertheless which approach to gather the device information is used, the parser and device handling part can be shared (even on a mixed usage scenario).

I understand that the goal here is to allow people to use netlink or hotplug interchangeably with mdev -p (which I still think is a poorly named but very desirable feature).

Please don't fixate on those specific parameter names; think of the specific functionalities. The names are details under discussion ... but here especially I expect you misunderstand something. mdev -p would be for internal purposes, to distinguish invocation from the usual mdev and mdev -s usage (current mdev). 
So if you don't change your system setup to benefit from the extra functionality, you won't ever need mdev -p; it is for internal usage and special purposes (the p stands for parser, a quick and dirty selection, to use something).

As stated before, I don't think that this approach is really functional, and would be more opposed to using it than to using netlink or a plain hotplugger. For this reason, I'm opposed to including it in *mdev*.

??? Not functional? In which way? What do you fear?

I also think that those who *do* want to use this approach would benefit more from a non-busybox binary, since the hotplugger needs to be as small and minimal as possible. Hence, I suggest doing it outside busybox.

Yes, that hotplug helper may benefit from being as small and fast as possible, but a separate program means a separate binary, and that conflicts with my one-single-static-binary preference, which I share with others. I consider splitting off that helper into a separate binary to be under discussion, but otherwise it won't change anything in the concept (it is not much more than a compile / link question).
Re: RFD: Rework/extending functionality of mdev
On 11.03.2015 22:44, Michael Conrad wrote: What specifically is the appeal of a third approach which tries to re-create the kernel netlink design in user-land using a fifo written from forked hotplug helpers?

You mix things up a bit. My approach allows using either the netlink or the kernel hotplug method, sharing the code and invoking only one instance of the parser / handler even on hotplug event bursts. The third method is the initial device file system population. Splitting this into different processes is the Unix way of multi-threading, using a fifo (= named pipe) for inter-process communication.
Re: RFD: Rework/extending functionality of mdev
Hi Michael !

On 11.03.2015 22:44, Michael Conrad wrote: I'm interested in this thread, but there is too much to read. Can you explain your reason in one concise paragraph?

One paragraph is a bit too short, and my English sucks, but I'll try to summarize the intention of my approach in compact steps (a bit more than one screen page of text):

- current kernel-hotplug-based mdev suffers from parallel starts and from reading / scanning the conf in each instance
- the known and proven solution to this is a netlink reader daemon (a long-lived daemon which stays active all the time)
- there are still people who insist on staying with the kernel hotplug feature (for whatever reason, so accept that we need hotplug + netlink)
- the third method (scanning the sys file system) is the operation of mdev -s (initial population of the device file system; don't mix it up with mdev w/o parameters); in addition some users take this as semi-automatic device management (embedded world)
- so how can we have all three methods without duplication of code, plus how can we speed up the kernel hotplug method?
- the answer is to split the gathering parts from the conf file parser and device operation, plus let the parser / handler accept many events with one invocation (event bursts), plus save memory when the event system is idle (the parser process exits when idle for some duration)
- the kernel hotplug helper could fire up the fifo and a parser / handler when one is required, but this check adds extra delay / cost on the first / all delivered events
- the solution is a minimalistic fifo watcher and parser startup daemon (a proven Unix concept for on-demand N-to-1 inter-process communication); the fifo watcher creates and holds the fifo open, but never touches data in the fifo, only starts up a parser when required, which allows failure management when the parser dies
- now the system maintainer can decide which method to use, unwanted methods may be opted out in the BB config, plus easier embedding of BB-based device management from external programs (include the parser, drop unwanted methods)
- besides the netlink code, after the rework I expect a near 1:1 average binary size compared to the current code, but less memory usage on event bursts (only one parser process), plus a speed improvement on event bursts (faster system startup when using hotplug)
- other intended functional improvements are a personal preference: the ability of a one-shot device file system startup (a single command to set up all device stuff, still under full control of the admin, no hard-coded functionality in any binary)

And last: don't stick to the mdev -... names mentioned; look at the intended functionalities, the implementation details (names to use) are still under discussion. Hope that was short enough. -- Harald
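The fifo-watcher step above can be sketched in shell. This is a toy illustration of the mechanism only, not mdev code: the fifo path, the log file, and the echo-only "parser" are all made up for the demo, and `read -t` is a busybox-ash/bash extension.

```shell
#!/bin/sh
# Illustrative sketch of the fifo-watcher idea: the watcher holds the
# fifo open, N writers (hotplug helpers) write one line per event, and
# a single on-demand parser drains a whole burst, then exits when idle.
FIFO=/tmp/mdev.fifo
LOG=/tmp/mdev.log

parser() {
    # One instance handles an entire burst; after 1 idle second the
    # read times out and the parser exits, freeing its memory.
    while read -t 1 action devpath; do
        echo "handled $action $devpath"
    done
}

rm -f "$FIFO" "$LOG"
mkfifo -m 600 "$FIFO"

# Holding the fifo open read-write means writers never block and the
# parser never sees EOF; the idle timeout is what ends the parser.
exec 3<>"$FIFO"
parser <&3 > "$LOG" &

# Event producers just write one line per event:
echo "add /devices/platform/serial8250" > "$FIFO"
echo "remove /devices/platform/serial8250" > "$FIFO"

wait    # parser exits about 1s after the last event
cat "$LOG"
```

A real fifo watcher would also respawn the parser on the next event after an idle exit; that respawn loop is omitted here to keep the mechanism visible.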
Re: RFD: Rework/extending functionality of mdev
On Tue, 10 Mar 2015 17:26:20 +0100 Harald Becker ra...@gmx.de wrote: First look at what to do, then decide how to implement. mdev needs to do the following steps: - on startup the sys file system is scanned for device entries

You don't want to scan sysfs for device entries. Instead you want to trigger hotplug events. Something like:

find /sys -type f -name uevent | while read entry; do
    echo add > $entry
done

- as a hotplug handler, for each event a process is forked, passing information in environment variables

You don't want to fork a process for every hotplug event. Instead you want to catch the hotplug events with netlink.

- when using netlink, a long-lived daemon reads event messages from a network socket to assemble the same information as the hotplug handler

You don't want to collect the same info twice. Instead you want a netlink listener (a long-lived daemon) that is minimal, and then a hotplug handler that deals with the events. That is what mdev does. It even handles MODALIAS events, which are not device entries.

- when all information for an event has been gathered, mdev needs to search its configuration table for the required entry, and then ... - ... do the required operations for the device entry

We can simplify what needs to happen at boot to: - prepare /sys /proc /dev - set up hotplug handler - trigger hotplug events

You can currently do that very simply with existing mdev:

# prepare /sys /proc /dev
mount ... /sys ...
# set up hotplug handler
echo /sbin/mdev > /proc/sys/kernel/hotplug
# trigger hotplug events
find /sys -type f -name uevent | while read entry; do
    echo add > $entry
done

This works but will trigger tons of forks and cannot guarantee that the events are handled in the correct order - a serialization problem. I think mdev has a hack (/dev/mdev.seq) to resolve the serialization problem. The performance is still bad due to all the forks. To avoid the many forks you need to use a netlink listener. This will also solve the serialization problem properly. 
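The coldplug loop above can be exercised without touching a live system. In this sketch the scratch tree path and device directories are made up for the demo; on a real system the tree is /sys and the writes need root, at which point the kernel re-emits the hotplug event for each device as if it had just been plugged in.

```shell
#!/bin/sh
# Coldplug replay: writing "add" into a device's uevent file asks the
# kernel to re-emit that device's hotplug event. A scratch tree stands
# in for /sys here so the loop itself can be demonstrated safely.
SYSFS=/tmp/fakesys
mkdir -p "$SYSFS/block/sda" "$SYSFS/class/tty/ttyS0"
: > "$SYSFS/block/sda/uevent"
: > "$SYSFS/class/tty/ttyS0/uevent"

# The trigger loop, with quoting so paths with spaces survive:
find "$SYSFS" -type f -name uevent | while read -r entry; do
    echo add > "$entry"
done

cat "$SYSFS/block/sda/uevent"    # prints "add"
```

On the real /sys only root may write to uevent files, and each write is what causes the registered hotplug helper (or a netlink listener) to see the event.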
You can solve this by adding netlink support to mdev and turning it into a daemon. Then the netlink listener and the event handler are in the same executable, similar to what udev does. This has a few drawbacks:

- the handler code will always be running, e.g. the mdev.conf parsing code will be in the daemon all the time. It can be discussed whether this is a real problem or not, because the kernel memory manager will handle memory usage and reuse inactive memory properly.
- mdev is currently designed to exit. Turning it into a daemon will likely require some work on error handling.

I think it is a better idea to leave mdev as a short-lived process which exits when it's done. We can do that by separating out the netlink listener into a separate daemon which forks mdev and sends a batch of events via a pipe - as explained in my previous email.

... Both the sys file system scanner and a netlink daemon could easily establish a pipe and then send device commands to the parser. The sys file system scanner (startup = mdev -s) can create the pipe, then scan the sysfs and send the commands to the parser. When done, the pipe can be closed and, after waiting for the parser process, just exit.

You don't want the sysfs scanner to send device commands anywhere. You simply trigger the hotplug events with echo add > /sys/./uevent and then the netlink listener will deal with them as if the devices had been hotplugged.

The netlink daemon can establish the netlink socket, then read events and sanitize the messages. When there is any message for the parser, a pipe is created and messages can be passed to the parser. When netlink is idle for some amount of time, it can close the pipe and check the child status.

The netlink listener daemon will need to deal with the event handler (or parser, as you call it) dying. I mean, the handler (the parser) could hit some error - out of memory, someone broke mdev.conf, or anything else that causes mdev to exit. 
If the child (the handler/parser) dies, a new pipe needs to be created and the handler/parser needs to be re-forked. With that in mind, wouldn't it be better to have the timer code in the handler/parser? When no new messages come from the pipe within a given time, the handler/parser just exits.

Confusion arises only on the hotplug handler part, as here a new process is started for every event by the kernel. Setting up a pipe to send this to the parser would double the overhead. But leaving the parser running for some amount of time would only work with a named fifo, starting the parser when required, and adding timeout management to the parser ... named pipes will just make things more complicated. -nc
Re: RFD: Rework/extending functionality of mdev
On Tue, 10 Mar 2015 17:56:59 + Isaac Dunham ibid...@gmail.com wrote: Now, a comment or two on design of mdev -i and the netlink listener: I really *would* like there to be a timeout of some sort; in my experience, 1 second may be rather short, and 5 seconds is usually ample.

Just one more comment on the default time-out value. The idea was to reduce the number of forks, not completely remove them. We want to make each burst (many events within a short time) go via the pipe, while we allow the less frequent events to use fork. If there is a long delay due to slow devices like USB1 etc., then the handler will exit for a while, and once the delayed event comes in, the handler will just auto-respawn. We will never have more than one fork per second, though. We don't even need to make it configurable unless we think that one fork per second is too many forks. -nc
Re: RFD: Rework/extending functionality of mdev
On Tue, 10 Mar 2015 17:56:59 + Isaac Dunham ibid...@gmail.com wrote: On Tue, Mar 10, 2015 at 01:37:41PM +0100, Harald Becker wrote: Hi Laurent ! ... I dislike the idea of integrating early init functions into mdev, because those functions are geographically (as in the place where they're performed) close to mdev, but they have nothing to do with mdev logically. Sorry, I don't agree. mdev's purpose is to setup the device file system infrastructure, or at least to help with setting up this. Ok, at first leave out proc and sysfs, there you may be right, but what about devpts on /dev/pts, or /dev/mqueue, etc., and finally the tmpfs for /dev itself. Do you really say they do not belong to the device file system infrastructure? Or all those symlinks like /dev/fd, /dev/stdin, /dev/stdout, etc., do you consider them not to belong to mdev? Not talking about setting owner informations and permissions for those entries in the device file system.

In my humble opinion, these do not belong to mdev the busybox applet. +1

... Factors that make me want this: * mdev can noticeably slow down boot (I've set up mdev as hotplug helper with Debian, and notice that the resulting initrd takes a second or two more than udev to load.)

I would assume that loading all the modules in Debian results in a fork bomb. This is why I want a separate pipe for modaliases. I want to disable MODALIAS handling in mdev and add modprobe -i support with a timeout feature.

* hotplug is a rare event, but with modern hardware it frequently causes a sudden burst of activity. A flash drive frequently triggers 5+ events, and plugging in a phone / turning on mass storage may cause a dozen or more. Even plugging in an SD card may cause 3 or more events.

Exactly. The events come in bursts. That is why I think it makes sense to auto-spawn the handler, forward the events via a pipe, and exit the handler in periods with no activity. 
The long-lived instance probably needs to stay out of busybox to really minimize memory use, though.

+1 Certainly it shouldn't be in the same binary as mdev. (Some people use multiple busybox binaries, which is probably more optimal.)

Agree. ... Now, a comment or two on the design of mdev -i and the netlink listener: I really *would* like there to be a timeout of some sort; in my experience, 1 second may be rather short, and 5 seconds is usually ample. I also think that the timeout should be adjustable. Now, the timeout could be done in at least two ways:

(a) timeout in mdev -i:
    - disable the timeout on read (or set a timeout for executing rules)
    - parse input and execute rules
    - at the end of execution, reset the timeout for read

(b) timeout in the netlink daemon:
    - listen for events
    - on event: disable the timeout, check the state of mdev/pipe, respawn if needed, write the event to the pipe, check the write result (if needed, retry), reset the timeout
    - on timeout: close the pipe and let mdev die when it finishes

Everything but the timeout has to be done anyhow in both mdev and the netlink daemon. In (a), the timeout could be set on the mdev commandline, or in mdev.conf; if it is set on the mdev commandline, *that* needs to be specified on the netlink daemon's commandline or in its config file.

If you set it on the command line, then you need to respawn the netlink listener if you want to change it (or have the netlink listener re-read a config file on every event; I don't think we want that). If we use mdev.conf, you can just change that file and kill mdev, and the new setting is active. Not that this is a big issue - just an interesting consequence I had not thought of.

In (b), the timeout is specified on the netlink daemon's commandline or in its config file. Somehow, I find myself preferring option (b); managing timeouts seems to be the job of a daemon rather than a hotplugger, and it fits logically with everything else that the netlink daemon is doing. 
I tend to prefer option (a), because the netlink daemon needs to deal with a killed mdev anyway. If we let mdev handle the timeout, then the netlink daemon only needs to check for POLLHUP on the pipe fd. This also has the benefit that the long-lived daemon becomes smaller. -nc
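Option (a) - the handler owning the timeout - can be sketched in a few lines of shell. Here `process_event` is a hypothetical stand-in for mdev's rule matching, and `read -t` is a busybox-ash/bash extension; this is an illustration of the mechanism, not mdev code.

```shell
#!/bin/sh
# Sketch of option (a): events arrive on stdin (the pipe from the
# netlink daemon); when the pipe goes quiet, read times out and the
# handler exits, so the daemon only has to notice POLLHUP/EPIPE and
# respawn the handler on the next event.
process_event() {
    # Hypothetical stand-in for matching an event against mdev rules.
    echo "rule-match: $1"
}

handler() {
    while read -t 5 line; do
        process_event "$line"
    done
    # Reached on idle timeout or on EOF (daemon closed the pipe):
    # exiting here frees all parser state until the next burst.
}

# Demo: feed a two-event burst through a pipe, as the daemon would.
printf 'add sda\nadd sda1\n' | handler
```

Because the pipe is closed after the demo burst, the loop ends on EOF immediately rather than waiting out the 5-second idle timeout.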
Re: RFD: Rework/extending functionality of mdev
Hi Laurent !

1) Starting up a system with mdev usually involves the same steps: mount proc, sys and a tmpfs on dev, then add some subdirectories to /dev, set owner, group and permissions, symlink some entries and possibly mount more virtual file systems. This work needs to be done by the controlling startup script, so we reinvent the wheel in every system setup.

The thing is, this work doesn't always need to be done, and when it does, it does whether you use udevd, mdev, or anything else. /dev preparation is one of the possible building blocks of an init routine, and it's quite separate from mounting /proc and /sys, and it's also separate from populating /dev with mdev -s.

You missed the fact that I said everything stays under the control of the admin or the system setup. Nobody shall be forced to do things in a specific way. I just want to give some extended functionality and flexibility to do this setup in a very easy and, IMO, straightforward-looking way. If you like/need to do those mounts in a specific way, just put them in your script and leave the mount lines out of mdev.conf; otherwise, if you need to set up those virtual file systems with specific options, you can specify them on the same configuration line.

... I dislike the idea of integrating early init functions into mdev, because those functions are geographically (as in the place where they're performed) close to mdev, but they have nothing to do with mdev logically.

Sorry, I don't agree. mdev's purpose is to set up the device file system infrastructure, or at least to help with setting it up. Ok, at first leave out proc and sysfs, there you may be right, but what about devpts on /dev/pts, or /dev/mqueue, etc., and finally the tmpfs for /dev itself? Do you really say they do not belong to the device file system infrastructure? Or all those symlinks like /dev/fd, /dev/stdin, /dev/stdout, etc., do you consider them not to belong to mdev? 
Not to mention setting ownership information and permissions for those entries in the device file system. Sure, all that can be done with some shell scripts, scattering all that information around different places, or with the need to set up and read/process different configuration files. My intention is to get this information, which depends on the system setup, into a single, relatively simple to use and modifiable location. The startup script itself just invokes mdev, which gets the information from /etc/mdev.conf and calls the necessary commands to do the operation. That is, it frees the distro manager from writing script code to put the system-specific information into the startup script. I consider Busybox to be a set of tools anybody may use to set up a system to their own wishes, not a means to force anybody to do things in the way any one person may feel is the best way; I want to enable functionality for those who like to collect this information and put it in a central place - information usually scattered around and hidden deep in the scripts controlling the startup.

MOUNTPOINT UID:GID PERMISSIONS %FSTYPE [=DEVICE] [OPTIONS]

I can't help but think this somehow duplicates the work of mount with /etc/fstab. If you really want to integrate the functionality in some binary instead of scripting it, I think the right way would be to patch the mount applet so it accepts any file (instead of hardcoding /etc/fstab), so you could have an early mount points file, distinct from your /etc/fstab, that you would call mount on before running mdev -s.

Laurent, you didn't look at the examples. I do not want to hard-code any *functionality* in mdev. I want to add extra functionality to the current mdev.conf syntax to allow doing some more stuff which is usually done geographically close to calling mdev, and done on so many systems in a similar manner. Look at the usual usage of /etc/fstab: on how many systems do you find information about your virtual file systems there? 
Usually fstab is used for the disk devices. In addition, what about creating the mount points and setting owner, group and permissions? This is not done by mount and not specified in fstab. So changing anything there would mean modifying the fstab syntax, possibly breaking other programs and scripts that read and modify fstab. Neither do I want to code any special functionality into a binary, nor do I try to duplicate the operation of mount. I just want to extend the mdev.conf syntax to add simple configuration information for those close-to-mdev operations at a central place, parse this information and call the usual commands, e.g. mount with the right options, as shell scripts do. And what else are a few lines in mdev.conf describing those mounts, other than placing them in a separate early mount-points file?

... ok, let's go one step further on this. Let us add an include option to mdev.conf, which allows splitting the mdev configuration into different files, and/or place
Re: RFD: Rework/extending functionality of mdev
Hi Harald, I'm sorry if I came across as dismissive or strongly opposed to the idea. It was not my intent. My intent, as always, is to try and make sure that potential new code 1. is worth writing at all, and 2. is designed the right way. I don't object in order to discourage you, I object in order to generate discussion. Which has been working so far. ;)

You missed the fact that I said everything stays under the control of the admin or system setup. Nobody shall be forced to do things in a specific way. I just want to give some extended functionality and flexibility to do this setup in a very easy and, IMO, straightforward way.

I understand your point. If I don't modify my mdev.conf, everything will still work the same and I can script the functionality if I prefer; but if you prefer centralizing the mounts and other stuff, it will also be possible to do it in mdev.conf. It is very reasonable. The only questions are: - what are the costs of adding your functionality to mdev.conf ? - are the benefits worth the costs ? You may have noticed that I'm very conservative as far as code is concerned ;)

You're saying "I like centralized information, the benefits are obvious to me" and I'm answering "I prefer scripting, so I don't see the benefits of your change". It's a debate of taste, and we'll never get anywhere like this. We need to dig a little more. As a preamble, let me just say that if you manage to make your syntax extensions a compile-time option and I can deactivate them to get the same mdev binary as I have today, then I have no objection, apart from the fact that I don't think it's good design - see below.

Sorry, I don't agree. mdev's purpose is to set up the device file system infrastructure, or at least to help with setting it up. OK, leave out proc and sysfs for now, there you may be right, but what about devpts on /dev/pts, or /dev/mqueue, etc., and finally the tmpfs for /dev itself? Do you really say they do not belong to the device file system infrastructure?
Or all those symlinks like /dev/fd, /dev/stdin, /dev/stdout, etc., do you consider them not to belong to mdev? Not to mention setting owner information and permissions for those entries in the device file system.

OK, now this is interesting. I firmly believe that most of our disagreement comes from the fact that mdev and mdev -s are the same binary, but *not* the same functionality at all. To me, mdev.conf is a configuration file for mdev functionality, i.e. configuration for a hotplug helper. Since mdev can be invoked on every hotplug event, it's very important that mdev.conf parsing remain fast. You are saying that you would benefit from user interface improvements to the /dev *initialization* functionality, i.e. mdev -s. I understand, and I agree that a one-stop shop for /dev initialization would be nice.

My point, which I didn't make clear in my previous message because I was too busy poking fun at you (my apologies, banter should never override clarity), is that I find it bad design to bloat the parser for a hotplug helper configuration file in order to add functionality for a /dev initialization program, which has *nothing to do* with hotplug helper configuration. The confusion between the two comes from two places:

- the applet name, of course; I find it unfortunate that mdev and mdev -s share the same name, and I'd rather have them called mdev and mdev-init, for instance.
- the fact that mdev's activity as a hotplug helper is about creating nodes, fixing permissions, and generally doing stuff around /dev, which is, as you say, logically close to what you need to do at initialization time, so at first sight the configuration file looks like a tempting place to put that initialization-time work.

But I maintain that mdev.conf should be for mdev, not for mdev -s. mdev -s is just doing the equivalent of calling mdev for every device that has been created since boot. If you make a special section in mdev.conf for mdev -s, this 1.
blurs the lines even more, which is not desirable, and 2. bloats the parser for mdev with things it does not need after the first mdev -s invocation.

Sure, all that can be done with some shell scripts, scattering all that information around at different places, or requiring different configuration files to be set up and read/processed. My intention is to get this information, which depends on the system setup, into a single location that is relatively simple to use and modify.

I agree with that goal. I just don't think mdev.conf is the place to do it.

The startup script itself just invokes mdev, which gets the information from /etc/mdev.conf

No: the startup script invokes mdev -s, which invokes (or does the equivalent of invoking) mdev, which gets the information from mdev.conf. But the functionality you're planning to add is specific to mdev -s: the actions would be taken by mdev -s but outside of its series of mdev invocations.

And what else are a few lines, in mdev.conf, describing those mounts, other
Re: RFD: Rework/extending functionality of mdev
On Tue, Mar 10, 2015 at 01:37:41PM +0100, Harald Becker wrote: Hi Laurent ! ... I dislike the idea of integrating early init functions into mdev, because those functions are geographically (as in the place where they're performed) close to mdev, but they have nothing to do with mdev logically. Sorry, I don't agree. mdev's purpose is to set up the device file system infrastructure, or at least to help with setting it up. OK, leave out proc and sysfs for now, there you may be right, but what about devpts on /dev/pts, or /dev/mqueue, etc., and finally the tmpfs for /dev itself? Do you really say they do not belong to the device file system infrastructure? Or all those symlinks like /dev/fd, /dev/stdin, /dev/stdout, etc., do you consider them not to belong to mdev? Not to mention setting owner information and permissions for those entries in the device file system.

In my humble opinion, these do not belong to mdev the busybox applet. "Do one thing" means *method*, not just problem area - just like cut, head, and tail are separate programs, despite all selecting portions of a text input based on mathematical criteria. And mdev takes care of setting up devices by looking at information from the kernel (via uevent or environment; the two are very similar). If you don't make this distinction, how do you argue that lspci should not query the PCI ID database via DNS (-Q), or that it's out of scope for an init system to fold in the device manager?

Now, if you added support for using busybox as pre-compiled init scripts, it would be reasonable for an applet to do this. (Yes, I've heard of such things being done with busybox.)

... but here it's different. s6-devd spawns a helper program for every event, like /proc/sys/kernel/hotplug would do, but it reads the events from the netlink so they're serialized.

Spawning a process for every event produces a massive slowdown on system startup.
My intention is to fork only one process, parse the conf table once and then read sanitized events from stdin, scanning the in-memory conf table and invoking the right operations.

If you use a format similar to uevent (eg, VAR1=VAL1\0VAR2=VAL2\0\0 or something similar) this sounds desirable to me. I suppose this could be mdev -i. However, I'd suggest first figuring out how to make environment variable hooks work the same in mdev -s/-i and plain mdev. Otherwise, mdev -i would be at a disadvantage. Note: while it's trivial to swap out the environment with each event, this is probably a Bad Idea for a longer-lived process, due to the potential for leaks and other chaos.

Factors that make me want this:
* mdev can noticeably slow down boot (I've set up mdev as hotplug helper with Debian, and noticed that the resulting initrd takes a second or two more than udev to load.) I would assume that loading all the modules in Debian results in a fork bomb.
* hotplug is a rare event, but with modern hardware it frequently causes a sudden burst of activity. A flash drive frequently triggers 5+ events, and plugging in a phone / turning on mass storage may cause a dozen or more. Even plugging in an SD card may cause 3 or more events.

In your design, the long-lived process forks a unique helper that reads mdev.conf... so it's not meant to be used with any program compatible with /proc/sys/kernel/hotplug - it's only meant to be used with mdev. So, why fork at all ? Your temporary instance is unnecessary - just have a daemon that parses mdev.conf, then listens to the netlink, and handles the events itself.

No, it is not unnecessary. The long-lived instance tries to stay at low memory usage, then forks, and the second instance reads mdev.conf into the memory table. When the event system goes idle, as it is most of the time, the second instance dies and its memory is freed to the system.
When I'm doing something particularly memory-intensive (in my case, *usually* linking a large program), I sometimes stop every service and program I can. This has allowed me to build at least one program I could not have built otherwise. There may be times when there's memory pressure that the user is unaware of, but frequently the user controls both memory pressure and hotplug.

The long-lived instance probably needs to stay out of busybox to really minimize memory use, though. It certainly shouldn't be in the same binary as mdev. (Some people use multiple busybox binaries, which is probably closer to optimal.)

Seriously, udev is a hard problem. udevd is bloated and its configuration is too complex; mdev is close to the point where parsing a config file for every invocation will become too expensive - I believe that it has not reached that point yet, but if you add more stuff to the conf language, it will.

Right, adding more to the conf language adds complexity there, but it removes complexity from the surrounding scripts and collects system-specific information at a central place (or places), usually hidden
Re: RFD: Rework/extending functionality of mdev
Hi, getting hints and ideas from Laurent and Natanael, I found we can get the most flexibility when we try to modularize the steps done by mdev. At first there are two different kinds of work to deal with:

1) The overall operation and usage of netlink
2) Extending the mdev.conf syntax

Both are independent, so first look at the overall operation ... and we are currently looking at operation/functionality. This does not mean they are all separate programs/applets; we may put several functionalities in one applet and distinguish them by options. First look at what to do, then decide how to implement it.

mdev needs to do the following steps:
- on startup, the sys file system is scanned for device entries
- as a hotplug handler, a process is forked for each event, with information passed in environment variables
- when using netlink, a long-lived daemon reads event messages from a network socket and assembles the same information as the hotplug handler
- when all information for an event has been gathered, mdev needs to search its configuration table for the required entry, and then ...
- ... do the required operations for the device entry

That is, the sys file system scan, the hotplug event handler and the netlink event daemon all trigger the operation of the mdev parser. Forking a conf file parser for each event is a lot of overhead on system startup, when many events arrive in a short amount of time. There we would benefit from a single process reading from a pipe, dying when there are no more events and being re-established when new events arrive. Both the sys file system scanner and a netlink daemon could easily establish a pipe and then send device commands to the parser. The parser reads mdev.conf once and creates an in-memory table, then reads commands from the pipe and scans the memory table for the right entry. On EOF on the pipe, the parser can exit successfully. The sys file system scanner (startup = mdev -s) can create the pipe, then scan the sysfs and send the commands to the parser.
When done, the pipe can be closed, and after waiting for the parser process, it just exits. The netlink daemon can establish the netlink socket, then read events and sanitize the messages. When there is a message for the parser, a pipe is created and messages can be passed to the parser. When netlink is idle for some amount of time, it can close the pipe and check the child status.

Confusion arises only on the hotplug handler part, as here a new process is started for every event by the kernel. Forking a pipe to send this to the parser would double the overhead. And leaving the parser running for some amount of time would only work with a named fifo, starting the parser on demand, and would add timeout management to the parser ...

... but ok, let's look at an alternative: Consider a small long-lived daemon, which creates a named fifo and then polls this fifo until data becomes available. On a hotplug event a small helper is started, which reads its information, serializes it, writes the command to the fifo and exits. The long-lived daemon sees the data (but does not read it), then forks a parser and gives the read end of the fifo to the parser. The parser reads mdev.conf once, then processes commands from the fifo. Now we are at the situation where the timeout needs to be checked in the parser. When there are no more events on the fifo, the parser just dies successfully (freeing the used memory). This will be detected by the small long-lived daemon, which checks the exit status and can act on failures (e.g. run a failure script). On successful exit of the parser, the daemon goes back to waiting for data on the fifo (which it still holds open for reading and writing).

This way the hotplug helper will benefit from a single-run parser on startup, but the memory used by the conf parser is freed during normal system operation. The doubling of the timeout management in the netlink daemon and the parser can be intentional when different timeouts are used.
Where a small duration can be chosen for the idle timeout of netlink, the parser itself uses a higher timeout, which only triggers when the hotplug helper method is used. Yes, there are some rough corners, but we are in the brainstorming phase. Besides those corners, we get a modular system which avoids respawning/rereading the conf table for every event, but frees memory when there are no more events. Even the hotplug helper method will benefit, as the helper process can exit as soon as the command has been written to the fifo. The parser reads serialized commands from the pipe and processes the required actions. Maybe we should consider using that small parser helper daemon and the named fifo in all cases. The sys file system scanner, hotplug helper and netlink daemon would then just use the fifo. This would even allow using the same fifo to activate the mdev parser from a user-space program (including a single parser start for multiple events).
Re: RFD: Rework/extending functionality of mdev
Hi Natanael !

I am interested in a netlink listener too, for 2 reasons: - serialize the events - reduce the number of forks for performance reasons

My primary intentions. That is, I want to auto-fork a daemon which just opens the netlink socket. When events arrive it forks again, creating a pipe. The new instance reads mdev.conf, builds a table of rules in memory, then reads hotplug operations from the pipe (sent by the first instance). When there are no more events for more than a few seconds, the first instance closes the pipe and the second instance exits (freeing the used memory). On the next hotplug event a new pipe / second instance is created.

I have a similar idea, but slightly different. I'd like to separate the netlink listener and the event handler.

Ack. After thinking about Laurent's message, I came to this, too. Split off the netlink part and use a pipe for communication. That can even go further: split off the initial sys scanning and hotplug parts from the parser and also use the pipe to communicate, creating an mdev wrapper around this, so handling stays as is. This does not mean we need separate applets; it may all be included in one mdev applet with the operation controlled by options. Later I will write a reply to Laurent's message, going into more detail.

I am thinking of using http://git.r-36.net/nldev/ which basically does the same thing as s6-devd: minimal daemon that listens on netlink and for each event fork/execs mdev.

Ok, this may be a second alternative. As I do not want to reinvent the netlink part, I will take a deep look at the possible alternatives and try to adapt them for Busybox.

- the mdev pipe fd is added to the poll(2) call so we catch POLLHUP to detect mdev timeout. When that happens, set the mdev pipe fd to -1 so we know that it needs to be respawned on the next kernel event.

Why do it in such a complicated way? The mdev parser shall just read simple device add/remove commands from stdin until EOF, then exit. That's it.
The netlink part can easily watch how long it has been idle and then just close the pipe. As soon as more events arrive it creates a new pipe and forks another mdev parser. This needs time management and poll in only one program. All other code is simple and straightforward. The netlink reader, as a long-lived daemon, already needs to watch the forked processes and act on failures. ... but those are implementation details and some optimization. I agree on the ideas/functionality behind this.

The benefits: - the netlink listener, which needs to be running at all times, is very minimal.

This may be an argument for not linking the netlink part into BB, but it does not otherwise change the idea behind this.

- when there are many events within a short time (eg coldplugging), we avoid the many forks and gain performance. - when there are no events, mdev will timeout and exit.

ACK

- busybox mdev does not need to set up a netlink socket. (less intrusive changes in busybox)

This can be done as an alternative, so the admin may decide whether he likes to use netlink or the hotplug helper. That is, nobody is forced to handle things in a special way. BB shall just give the tools to build easy setups. Unwanted parts/applets may be left out in the BB configuration (if size matters, else default all tools in and let the admin choose).

Then I'd like to do something similar with modprobe: - add support to read modalias from stdin and have 1 sec timeout. - have nldev pipe/fork/exec modprobe --stdin on MODALIAS events.

Nice idea. Maybe with some slight modification to optimize the timeout handling.

That way we can also avoid the many modprobe forks during coldplug.

ACK, so the project needs a wider view. Thanks for pointing me to this. ... later more details in my reply to Laurent's message. -- Harald ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: RFD: Rework/extending functionality of mdev
On Sun, 08 Mar 2015 16:10:32 +0100 Harald Becker ra...@gmx.de wrote: Hi, I'm currently in the phase of thinking about extending the functionality of mdev. As the experts working with this kind of software are here, I'd like to hear your ideas before I start hacking the code. ... 2) I'd like to use netlink to obtain hotplug information and avoid massive respawning of mdev as hotplug helper when several events arrive quickly.

I am interested in a netlink listener too, for 2 reasons: - serialize the events - reduce the number of forks for performance reasons

That is, I want to auto-fork a daemon which just opens the netlink socket. When events arrive it forks again, creating a pipe. The new instance reads mdev.conf, builds a table of rules in memory, then reads hotplug operations from the pipe (sent by the first instance). When there are no more events for more than a few seconds, the first instance closes the pipe and the second instance exits (freeing the used memory). On the next hotplug event a new pipe / second instance is created.

I have a similar idea, but slightly different. I'd like to separate the netlink listener and the event handler. I am thinking of using http://git.r-36.net/nldev/ which basically does the same thing as s6-devd: minimal daemon that listens on netlink and for each event fork/execs mdev. What I'd like to do is:

change mdev to: - be able to read events from stdin, same format as from the netlink socket. - set a timeout on stdin (1 sec or so by default). When the timeout is reached (no event within a sec), just exit.

change nldev to: - have an mdev pipe fd which we forward the kernel events to. - on kernel event, if mdev_pipe_fd is -1 then: create pipe, fork and exec mdev with args to have mdev read from stdin (as explained above); else: write the kernel event to the pipe fd - the mdev pipe fd is added to the poll(2) call so we catch POLLHUP to detect mdev timeout.
When that happens, set the mdev pipe fd to -1 so we know that it needs to be respawned on the next kernel event.

The benefits: - the netlink listener, which needs to be running at all times, is very minimal. - when there are many events within a short time (eg coldplugging), we avoid the many forks and gain performance. - when there are no events, mdev will timeout and exit. - busybox mdev does not need to set up a netlink socket. (less intrusive changes in busybox)

Then I'd like to do something similar with modprobe: - add support to read modalias from stdin and have 1 sec timeout. - have nldev pipe/fork/exec modprobe --stdin on MODALIAS events. That way we can also avoid the many modprobe forks during coldplug. -nc
Re: RFD: Rework/extending functionality of mdev
On 09/03/2015 09:41, Natanael Copa wrote: What I'd like to do is: change mdev to: - be able to read events from stdin. same format as from netlink socket.

The thing is, the format from the netlink socket is a bit painful to parse; the hard part of netlink listeners isn't actually listening to the netlink (which is as small and easy as a socket() call), but parsing the messages. If your goal is to keep the netlink stuff out of mdev, you're probably better off designing a simpler format to pass the events on stdin. If you don't want to do that, you might as well make mdev listen to the netlink itself - you'll spare one extra process.

The information that a hotplug helper needs is actually very simple: it's a small dictionary like an envp, and /proc/sys/kernel/hotplug forks the helper with just the right information in its environment. So I'm going to suggest the following format: when the netlink listener gets an event, it sends a sequence of null-terminated VARIABLE=VALUE strings to the helper's stdin, and the end of the set is signaled by an extra null character. Example: HOME=/\0TERM=linux\0ACTION=something\0DEVPATH=/foo\0\0

- set a timeout on stdin (1 sec or so by default). when time out is reached (no event within a sec) then just exit.

I'm still doubtful of the benefits of that approach. mdev is small, and when it's not doing anything, it's not using many resources. And you have to commit those resources anyway, since a kernel event might appear at any time and you don't want to OOM because of that - so, why not simply keep a long-lived helper ? -- Laurent
RFD: Rework/extending functionality of mdev
Hi, I'm currently in the phase of thinking about extending the functionality of mdev. As the experts working with this kind of software are here, I'd like to hear your ideas before I start hacking the code. I'd like to focus on the following topics:

1) Starting up a system with mdev usually involves the same steps: mount proc, sys and a tmpfs on dev, then add some subdirectories to /dev, set owner, group and permissions, symlink some entries and possibly mount more virtual file systems. This work needs to be done by the controlling startup script, so we reinvent the wheel in every system setup. I'd like to extend the syntax of mdev.conf with some extra information and add code to mdev to allow doing this operation in a more simplified way, but still under the full control of the system maintainer. Those extra entries will only be executed with mdev -s, not during hotplug. The syntax has been chosen to not (horribly) break existing mdev.conf setups.

Current major syntax of mdev.conf:

[-][envmatch]device regex uid:gid permissions [...]
[envmatch]@maj[,min1[-min2]] uid:gid permissions [...]
$envvar=regex uid:gid permissions [...]

- Additional syntax to mount (virtual) file systems:

MOUNTPOINT UID:GID PERMISSIONS %FSTYPE [=DEVICE] [OPTIONS]

This rule is triggered by the percent sign introducing the file system type. It shall create the mount point (if it does not exist), set owner/group/permissions of the mount point, fork and exec mount -t FSTYPE -o OPTIONS DEVICE MOUNTPOINT. If DEVICE is not specified, the literal "virtual" shall be used. e.g.

# mount virtual file systems
/proc     root:root 0755 %proc
/sys      root:root 0755 %sysfs
/dev      root:root 0755 %tmpfs size=64k,mode=0755
/dev/pts  root:root 0755 %devpts

This will do all the required mounting with a single mdev -s invocation, even on a system which has nothing else mounted. The old behavior of mounting the file systems in the calling scripts will still be available; just leave the mount lines out of mdev.conf.
- Additional syntax to add directories and set their owner information:

DIRNAME/ UID:GID PERMISSIONS [ LINKNAME]

This rule is triggered by a slash as the last character of the match string. It shall create the given directory, where relative names are relative to the expected /dev base. e.g.

# add required subdirectories to the device file system
loop/  root:root 0755
input/ root:root 0755

Those directories may be created automatically due to other rules, but then you can't control their owner information. The extra rule allows creating the subdirectories on startup and setting the owner information as you like. Later matching device rules will not change this, so you can tune the directory and device permissions.

- Additional syntax to add symlinks and set their owner information:

PATHNAME@ UID:GID LINKNAME

This rule is triggered by the at sign as the last character of the match string. It shall add the given PATHNAME as a symlink to LINKNAME, and set the owner of the link. e.g.

# add symbolic links to the device filesystem
fd@     root:root /proc/fd
stdin@  root:root fd/0
stdout@ root:root fd/1
stderr@ root:root fd/2

- Extended syntax for symlink handling on device nodes:

The current syntax allows either moving the device to a different name/location, or moving it and adding a symlink. In some situations you need just a symlink pointing to the new device.

DEVICE_REGEX UID:GID PERMISSIONS =NEW_NAME (old)

Moves the new device node to the given location.

DEVICE_REGEX UID:GID PERMISSIONS PATHNAME (old)

Will create the new device node with the name PATHNAME and create a symlink DEVICE_NAME pointing to PATHNAME. Shall remove an existing symlink and create a new one.

DEVICE_REGEX UID:GID PERMISSIONS PATHNAME (new)

Shall create the new device under its expected device name, and in addition create a symlink of name PATHNAME pointing to the new device. Existing symlinks shall not be touched. e.g.
creating a /dev/cdrom symlink for the first cdrom drive:

sr[0-9]+ root:cdrom 0775 cdrom

Shall create /dev/sr0 and a symlink /dev/cdrom -> /dev/sr0, but will not overwrite the symlink for /dev/sr1, etc. This may be combined with the move option:

DEVICE_REGEX UID:GID PERMISSIONS =NEW_NAME PATHNAME

Shall move the device to NEW_NAME as expected and then create the symlink to that location. e.g. moving sr0 into a subdirectory and adding a symlink:

sr[0-9]+ root:cdrom 0775 =block/ cdrom

Shall create /dev/block/sr0 and a symlink /dev/cdrom -> /dev/block/sr0, not changing /dev/cdrom if it already exists.

2) I'd like to use netlink to obtain hotplug information and avoid massive respawning of mdev as hotplug helper when several events arrive quickly. That is, I want to auto-fork a daemon which just opens the netlink socket. When events arrive it forks again, creating a pipe. The new instance reads mdev.conf, builds a table of rules in memory, then reads hotplug
Re: RFD: Rework/extending functionality of mdev
Hi Harald !

1) Starting up a system with mdev usually involves the same steps: mount proc, sys and a tmpfs on dev, then add some subdirectories to /dev, set owner, group and permissions, symlink some entries and possibly mount more virtual file systems. This work needs to be done by the controlling startup script, so we reinvent the wheel in every system setup.

The thing is, this work doesn't always need to be done, and when it does, it does whether you use udevd, mdev, or anything else. /dev preparation is one of the possible building blocks of an init routine, and it's quite separate from mounting /proc and /sys, and it's also separate from populating /dev with mdev -s. So I'd say that it's not about reinventing the wheel, it's that every system has different needs - and a 10-line script is easy enough to copy if you have several systems with the same needs. I dislike the idea of integrating early init functions into mdev, because those functions are geographically (as in the place where they're performed) close to mdev, but they have nothing to do with mdev logically.

- Additional syntax to mount (virtual) file systems: MOUNTPOINT UID:GID PERMISSIONS %FSTYPE [=DEVICE] [OPTIONS]

I can't help but think this somehow duplicates the work of mount with /etc/fstab. If you really want to integrate the functionality in some binary instead of scripting it, I think the right way would be to patch the mount applet so it accepts any file (instead of hardcoding /etc/fstab), so you could have an early mount-points file, distinct from your /etc/fstab, that you would call mount on before running mdev -s.

2) I'd like to use netlink to obtain hotplug information and avoid massive respawning of mdev as hotplug helper when several events arrive quickly. That is, I want to auto-fork a daemon which just opens the netlink socket.
So far, what you're talking about already exists: http://skarnet.org/software/s6-linux-utils/s6-devd.html

When events arrive it forks again, creating a pipe. The new instance reads mdev.conf, builds a table of rules in memory, then reads hotplug operations from the pipe (sent by the first instance).

... but here it's different. s6-devd spawns a helper program for every event, like /proc/sys/kernel/hotplug would do, but it reads the events from the netlink so they're serialized. In your design, the long-lived process forks a unique helper that reads mdev.conf... so it's not meant to be used with any program compatible with /proc/sys/kernel/hotplug - it's only meant to be used with mdev. So, why fork at all ? Your temporary instance is unnecessary - just have a daemon that parses mdev.conf, then listens to the netlink, and handles the events itself. ... and you just reinvented udevd. Congratulations ! ;)

Seriously, udev is a hard problem. udevd is bloated and its configuration is too complex; mdev is close to the point where parsing a config file for every invocation will become too expensive - I believe that it has not reached that point yet, but if you add more stuff to the conf language, it will. What we need is a configuration language for udev that's easy to understand (for a human) and fast to parse (for a machine). More thought needs to be poured into it - and it's in my plans, but not for the short term. What I'm sure of, though, is that the "fork a helper for every event" vs. "handle events in a unique long-lived program" debate is the wrong one - it's an implementation detail, that can be solved *after* proper udev language design. -- Laurent