Re: [PATCH 1/2] msi: Invert the sense of the MSI enables.

2007-05-25 Thread Jonathan Lundell

On May 24, 2007, at 10:51 PM, Andi Kleen wrote:


>> Do we have a feel for how much performance we're losing on those
>> systems which _could_ do MSI, but which will end up defaulting
>> to not using it?
>
> At least on 10GB ethernet it is a significant difference; you usually
> cannot go anywhere near line speed without MSI
>
> I suspect it is visible on high performance / multiple GB NICs too.


Why would that be? As the packet rate goes up and NAPI polling kicks  
in, wouldn't MSI make less and less difference?


I like the fact that MSI gives us finer control over CPU affinity  
than many INTx implementations, but that's a different issue.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

2007-04-15 Thread Jonathan Lundell

On Apr 15, 2007, at 10:59 AM, Linus Torvalds wrote:

> It's a really good thing, and it means that if somebody shows that your
> code is flawed in some way (by, for example, making a patch that people
> claim gets better behaviour or numbers), any *good* programmer that
> actually cares about his code will obviously suddenly be very motivated to
> out-do the out-doer!


"No one who cannot rejoice in the discovery of his own mistakes  
deserves to be called a scholar."


--Don Foster, "literary sleuth", on retracting his attribution of "A  
Funerall Elegye" to Shakespeare (it's more likely John Ford's work).



RE: x86 TSC time warp puzzle

2005-04-02 Thread Jonathan Lundell
At 3:13 AM -0500 4/2/05, Lee Revell wrote:
>On Fri, 2005-04-01 at 23:05 -0800, Pallipadi, Venkatesh wrote:
>> It can be SMI happening in the platform. Typically BIOS uses some SMI
>> polling to handle some devices during early boot. Though 500
>> microseconds sounds a bit too high.
>
>Nope, that sounds just about right.  Buggy BIOSes that implement ACPI
>via SMM (or so I have been told) can stall the machine for over a
>millisecond, this is why some laptops lose timer ticks at HZ=1000.  The
>issue is well known by Linux audio users, as it causes big problems for
>people who buy laptops for live audio use.

This is a desktop board, and this is well after boot (hours). Also, 
ACPI is disabled in the BIOS.

I suppose I can try to disable SMI via the APIC?
--
/Jonathan Lundell.


x86 TSC time warp puzzle

2005-04-01 Thread Jonathan Lundell
Well, not actually a time warp, though it feels like one.
I'm doing some real-time bit-twiddling in a driver, using the TSC to 
measure out delays on the order of hundreds of nanoseconds. Because I 
want an upper limit on the delay, I disable interrupts around it.

The logic is something like:

    local_irq_save
    out(set a bit)
    t0 = TSC
    wait while (t = (TSC - t0)) < delay_time
    out(clear the bit)
    local_irq_restore

From time to time, when I exit the delay, t is *much* bigger than 
delay_time. If delay_time is, say, 300ns, t is usually no more than 
325ns. But every so often, t can be 2000, or 1, or even much 
higher.

The value of t seems to depend on the CPU involved. The worst case is 
with an Intel 915GV chipset, where t approaches 500 microseconds (!).

This is with ACPI and HT disabled, to avoid confounding interactions. 
I suspected NMI, of course, but I monitored the nmi counter, and 
mostly saw nothing (from time to time a random hit, but mostly not).

The longer delay is real. I can see the bit being set/cleared in the 
pseudocode above on a scope, and when the long delay happens, the bit 
is set for a correspondingly long time.

BTW, the symptom is independent of my IO. I wrote a test case that 
does nothing but read the TSC, and I get the same result.
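
A stand-alone test of that kind might look like the sketch below. It is not the original test case: the names `read_counter` and `max_tsc_gap` are made up for illustration, and the `clock_gettime` fallback for non-x86 builds is an assumption. It reads the counter back to back and returns the largest jump between two consecutive reads.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the CPU time-stamp counter (units: TSC ticks). */
static inline uint64_t read_counter(void)
{
    return __rdtsc();
}
#else
/* ASSUMED fallback for non-x86 builds (units: nanoseconds). */
static inline uint64_t read_counter(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/*
 * Read the counter `iterations` times back to back and return the
 * largest difference seen between two consecutive reads.
 */
static uint64_t max_tsc_gap(unsigned long iterations)
{
    uint64_t prev, worst = 0;

    assert(iterations > 0);
    prev = read_counter();

    for (unsigned long i = 0; i < iterations; i++) {
        uint64_t now = read_counter();

        /* Ignore a counter that appears to step backwards. */
        if (now > prev && now - prev > worst)
            worst = now - prev;
        prev = now;
    }
    return worst;
}
```

Calling `max_tsc_gap()` with a large iteration count and comparing the result with the usual read-to-read cost is one way to spot stalls like the ones described here. In user space, ordinary preemption also shows up as gaps, so this only approximates the interrupts-off case in the driver.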

Finally, on some CPUs, at least, the extra delay appears to be 
periodic. The 500us delay happens about every second. On a different 
machine (chipset) it happens at about 5 Hz. And the characteristic 
delay on each type of machine seems consistent.

Any ideas of where to look? Other lists to inquire on?
Thanks.
--
/Jonathan Lundell.


Re: [PATCH] Minor cleanup and export three functions

2001-07-19 Thread Jonathan Lundell

At 3:03 AM +0100 2001-07-20, Anton Altaparmakov wrote:
>I do apologize. I didn't realize pine would do this. In pine I can just
>read the attachment as text and in Eudora it just appears as inlined
>text without any indication of it being a separate attachment, so I just
>assumed that it was sent clear text. Obviously not. Stupid mailers. Grr.

Eudora does leave you one little clue:

At 2:19 AM +0100 2001-07-20, Anton Altaparmakov wrote:
>MIME-Version: 1.0
>Content-Type: MULTIPART/MIXED; 
>BOUNDARY="-559023410-1804928587-995591940=:20239"

-- 
/Jonathan Lundell.



Re: [Acpi] Re: ACPI fundamental locking problems

2001-07-08 Thread Jonathan Lundell

At 3:26 AM -0400 2001-07-08, Alexander Viro wrote:
>On Sat, 7 Jul 2001, Jamie Lokier wrote:
>
>>  Daniel Phillips wrote:
>>  > > Reading a tarball is the distillation of what you describe into
>>  > > efficient form :)
>>  >
>>  > /me downloads tar file definition
>>  >
>>  > Um, gnu tar or posix tar? or some new, improved tar?
>>
>>  I suggest cpio, which is more compact and in some ways more standard.
>>  (tar has a silly pad-to-multiple-of-512-byte per file rule, which is
>>  inappropriate for this).  GNU cpio creates cpio format just fine.
>
>GNU cpio is a race-ridden unmaintained pile of junk. Look at the size
>of, say it, Debian patch to upstream source. Then try to read the
>patched code.  Quite a few of us simply don't have that FPOS on their
>boxen.
>
>Using cpio archive layout is OK, but _please_, don't make it dependent
>on GNU cpio.

If size is an issue (and of course it is), presumably the archive 
would be compressed. As long as tar can be convinced to pad with 
(say) nulls, the padding shouldn't have that much of an impact on 
archive size.
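
To put rough numbers on the padding, here is a small sketch. It assumes the classic tar layout of one 512-byte header block per member, followed by the file data rounded up to the next 512-byte boundary; the helper names are hypothetical and the end-of-archive blocks are ignored.

```c
#include <assert.h>
#include <stddef.h>

#define TAR_BLOCK 512u  /* tar block size in bytes */

/* Bytes of padding tar appends after `size` bytes of file data. */
static size_t tar_data_padding(size_t size)
{
    return (TAR_BLOCK - size % TAR_BLOCK) % TAR_BLOCK;
}

/* Total bytes one member occupies: header block + data + padding. */
static size_t tar_member_size(size_t size)
{
    size_t total = TAR_BLOCK + size + tar_data_padding(size);

    assert(total % TAR_BLOCK == 0);  /* members are always whole blocks */
    return total;
}
```

Under these assumptions a 1-byte file occupies 1024 bytes, 511 of them padding. If that padding is all nulls, a compressor should reduce it to very little.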
-- 
/Jonathan Lundell.



Re: [PATCH] proc_file_read() (Was: Re: proc_file_read() question)

2001-06-27 Thread Jonathan Lundell

At 10:07 AM +0200 2001-06-27, Martin Wilck wrote:
>On Tue, 26 Jun 2001, Jonathan Lundell wrote:
>
>>  I use the hack myself, to implement a record-oriented file where the
>>  file position is a record number. I could probably live with
>>  PAGE_SIZE, but the current hack works fine with start bigger than
>>  that, and it's possible that someone counts on it.
>
>Ok, let's use PAGE_OFFSET instead of PAGE_SIZE, then (see new patch
>below).
>Unless I'm misled, legitimate values of "start" as a pointer are always
>larger than that, and I can hardly imagine a case where the "unsigned
>int" value of start must be greater than PAGE_OFFSET.


PAGE_OFFSET definitely works for me, but a quick scan of the headers 
suggests that non-sun3 m68k builds define PAGE_OFFSET as 0, as does 
s390.

Maybe you want max(PAGE_SIZE, PAGE_OFFSET).

>I insist that relying on the comparison of two pointers is the wrong
>thing. If (as you suggest) the major use of "start" has migrated from the
>original intention to that of the "hack", this should be reflected
>in the interface by making the "start" parameter to read_proc ()
>an unsigned long. Everything else is misleading and error-prone.
>For now, "start" is a char* and should be treated as such.

That's the hack, though. Rusty should chime in, but the implicit 
restriction on start in the original hack (by the time we get to the 
test we're talking about) is that it's either a pointer of the form 
page+offset, where offset < PAGE_SIZE, or it's a (relatively) small 
file offset.
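
The two readings of start can be sketched in user space as follows. This is an illustration of the test under discussion, not the code in fs/proc/generic.c: `interpret_start`, `struct start_view`, and `DEMO_PAGE_SIZE` are made-up names, and the 4096-byte page is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u  /* ASSUMED page size, for illustration only */

/* What a read_proc-style callback's `start` value turned out to mean. */
struct start_view {
    const char *data;  /* where the bytes to copy begin */
    long advance;      /* how far to move the file position */
};

/*
 * A `start` that compares below `page` is taken as a small unsigned
 * offset; anything else is taken as a pointer into the page buffer.
 */
static struct start_view interpret_start(const char *page, const char *start,
                                         long n)
{
    struct start_view v;

    if ((uintptr_t)start < (uintptr_t)page) {
        /* Offset form: data sits at the start of the page. */
        v.data = page;
        v.advance = (long)(uintptr_t)start;
    } else {
        /* Pointer form: implied restriction is page + offset. */
        assert((uintptr_t)start - (uintptr_t)page < DEMO_PAGE_SIZE);
        v.data = start;
        v.advance = n;
    }
    return v;
}
```

The assertion spells out the implied restriction: a pointer-valued start lies inside page's allocation. A pointer to a separately allocated buffer would trip it, or be misread as an offset if it happened to sit below page.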

That's a reasonable assumption given that the procedure is 
dynamically allocating page. After all, why would you allocate the 
buffer and then not use it?

Sure, the overloading is self-admittedly hacky, but (again I assume) 
the motivation was to avoid breaking the clients, many of which are 
not in the kernel.org tree. Your proposed change overloads a third 
interpretation on start, namely an arbitrary pointer, outside the 
page allocation.

>>  But if you're allocating your own buffer, you'd probably be better
>>  off writing your own file ops, and not using the default
>>  proc_file_read() at all. At the very least you'd save a redundant
>>  __get_free_page/free_page pair.
>
>That's right, but nevertheless (repeat) comparing "start" and "page" is
>wrong.

Not given the implied restriction that, if start is a pointer at all, 
it's a pointer within page's allocation. And after all, PAGE_OFFSET 
is effectively a pointer.
-- 
/Jonathan Lundell.



Re: [comphist] Re: Microsoft and Xenix.

2001-06-26 Thread Jonathan Lundell

At 10:44 AM -0400 2001-06-26, Rob Landley wrote:
>"A quarter century of unix" mentions RK05 cartridges several times, but never
>says much ABOUT them.
>
>Okay, so they're 2.4 megabyte removable cartridges?  How big?  Are they tapes
>or disk packs?  (I.E. can you run off of them or are they just storage?)  I
>know lots of early copies of unix were sent out from Bell Labs on RK05
>cartridges signed "love, ken"...

http://www.pdp8.net/rk05/rk05.shtml

>What was that big reel to reel tape they always show in movies, anyway?

The big-refrigerator-sized guys were generally attached to 
mainframes, IBM or otherwise. Here's a little info: 
http://www.digital-interact.co.uk/site/html/reference/media_9trk.html 
(but take it with a grain of salt; IBM surely didn't go to nine 
tracks because of ASCII!).
-- 
/Jonathan Lundell.



Re: [PATCH] proc_file_read() (Was: Re: proc_file_read() question)

2001-06-26 Thread Jonathan Lundell

At 7:14 PM +0200 2001-06-26, Martin Wilck wrote:
>Hi,
>
>>  Shhh ;-)  Last time that hack was mentioned, someone wanted to _remove_
>>  it.  It's a very nice little hack to have around, and IKD uses it.
>
>I am not saying it should be removed. But IMO it is a legitimate (if
>not the originally intended) use of "start" to serve as a pointer to
>a memory area allocated in the proc_read () function. This use is broken
>with this hack in its current form, because reading from such a file
>will fail depending on the (random) order of the page and start pointers.
>
>If I understand the "hack" right, legitimate offsets generated for it
>are always between 0 and PAGE_SIZE. Therefore the patch below would
>not break it, while overcoming the abovementioned problem, because
>legitimate page pointers will never be < PAGE_SIZE.
>
>Please correct me if I'm wrong.

I use the hack myself, to implement a record-oriented file where the 
file position is a record number. I could probably live with 
PAGE_SIZE, but the current hack works fine with start bigger than 
that, and it's possible that someone counts on it.

But if you're allocating your own buffer, you'd probably be better 
off writing your own file ops, and not using the default 
proc_file_read() at all. At the very least you'd save a redundant 
__get_free_page/free_page pair.

>Cheers,
>Martin
>
>--
>Martin Wilck <[EMAIL PROTECTED]>
>FSC EP PS DS1, Paderborn  Tel. +49 5251 8 15113
>
>
>--- linux-2.4.5/fs/proc/generic.c  Mon Jun 25 13:46:26 2001
>+++ 2.4.5mw/fs/proc/generic.c  Tue Jun 26 20:42:22 2001
>@@ -104,14 +104,14 @@
>* return the bytes, and set `start' to the desired offset
>* as an unsigned int. - [EMAIL PROTECTED]
>*/
>-  n -= copy_to_user(buf, start < page ? page : start, n);
>+  n -= copy_to_user(buf, (unsigned long) start < PAGE_SIZE ? page : start, n);
>   if (n == 0) {
>   if (retval == 0)
>   retval = -EFAULT;
>   break;
>   }
>
>-  *ppos += start < page ? (long)start : n; /* Move down the file */
>+  *ppos += (unsigned long) start < PAGE_SIZE ? (unsigned long) start : n; /* Move down the file */
>   nbytes -= n;
>   buf += n;
>   retval += n;

-- 
/Jonathan Lundell.



Re: [OT] Re: When the FUD is all around (sniff).

2001-06-26 Thread Jonathan Lundell

At 4:02 PM +0100 2001-06-26, Alan Cox wrote:
>>  > There is a saying in the UK 'You can fool all of the people some of the
>>  > time, you can fool some of the people all the time, but you cannot fool all
>>  > of the people all of the time'.
>>
>>  Didn't Abraham Lincoln say that?  :)
>
>[Digs]
>Indeed in 1864.

Perhaps, perhaps not.

http://www.usnews.com/usnews/issue/970217/17linc.htm

>What Zall did with the plethora of Lincoln anecdotes--include and 
>evaluate the apparently authentic, delete the seemingly 
>apocryphal--other historians are doing with collections of his 
>words. Their task is daunting: No American is more quoted--or 
>misquoted--than Lincoln. Their work also is important: The image of 
>Lincoln, the historical as well as the mythical, has been shaped to 
>an uncommon degree by statements that other people put in his mouth, 
>often to suit their own purposes.
>
>Stanford's Don Fehrenbacher and his wife, Virginia, spent 12 years 
>compiling the Recollected Words of Abraham Lincoln (Stanford 
>University Press, 1996, $60), a collection of 1,900 quotations 
>attributed to Lincoln by more than 500 of his contemporaries. The 
>scholars rated the authenticity of quotations with letter grades: A 
>for a direct quote the listener wrote down soon after hearing it; B 
>for a quickly recorded indirect quote; C for quotes reported weeks, 
>months, or years later; D for one "about whose authenticity there is 
>more than average doubt"; E for those "probably not authentic."
>
>No fooling. One now familiar line the Fehrenbachers examined was far 
>from familiar to 19th-century America: "You can fool all the people 
>some of the time and some of the people all of the time, but you 
>can't fool all the people all of the time." The saying apparently 
>first emerged in print in 1901 in Lincoln's Yarns and Stories; the 
>book identified the person who allegedly heard Lincoln as "a caller 
>at the White House." Years later, two old-timers claimed they had 
>heard Lincoln say it in an 1856 address in Illinois, but a news 
>account of the speech didn't mention it. The Fehrenbachers give the 
>old-timers' recollections a D. The evidence, the scholars say, 
>"suggests that this is a case of reminiscence echoing folklore or 
>fiction."


-- 
/Jonathan Lundell.



[OT] Re: When the FUD is all around (sniff).

2001-06-26 Thread Jonathan Lundell

At 8:59 AM -0600 2001-06-26, Jordan Crouse wrote:
>>  There is a saying in the UK 'You can fool all of the people some of the
>>  time, you can fool some of the people all the time, but you cannot fool all
>>  of the people all of the time'.
>
>Didn't Abraham Lincoln say that?  :)

That's the common, but doubtful, attribution.
-- 
/Jonathan Lundell.



Re: Q serial.c

2001-06-22 Thread Jonathan Lundell

At 9:51 AM -0400 2001-06-22, Stuart MacDonald wrote:
>From: "kees" <[EMAIL PROTECTED]>
>>  What may happen on a SMP machine if a serial port has been closed and the
>>  closing stage is at shutdown() in serial.c in the call to free_IRQ  and
>>  BEFORE the IRQ is really shutdown, a new character arrives which causes an
>>  IRQ? Is it possible that the OTHER cpu  takes this interrupt and causes a
>>  crash?
>
>I'm looking at serial-5.05/serial.c. You'll notice at the
>beginning of shutdown the saveflags(); cli(); calls.
>This disables interrupts. The uart will not be able to
>generate IRQs even if new characters arrive.

The other CPU servicing the interrupt, was the question. cli() 
doesn't affect that. This could presumably happen if shutdown() gets 
run on a non-interrupt-servicing CPU, or if interrupts are 
dynamically routed (eg round-robin).

Where can I find the 5.05 driver?
-- 
/Jonathan Lundell.



Re: mktime in include/linux

2001-06-22 Thread Jonathan Lundell

At 1:43 PM +0200 2001-06-22, Erik Mouw wrote:
>On Thu, Jun 21, 2001 at 10:30:40PM -0400, Rick Hohensee wrote:
>>  Why does Linux have a mktime routine fully coded in linux/time.h that
>>  conflicts directly with the ANSI C standard library routine of the same
>>  name? It breaks a couple things against libc5, including gcc 3.0. OK, you
>>  don't care about libc5. It's still pretty weird. Wierd? Weird.
>
>This has been brought up many times on this list: you are not supposed
>to include kernel headers in userland.

That's not the problem, I think. Most of time.h, including the 
definition of mktime, is #ifdef __KERNEL__, so it shouldn't be 
breaking anything in userland even if you do include it. And you 
might, in order to obtain the interface definition of struct 
timespec. What's weird is: why is __KERNEL__ getting #defined in 
Rick's userland?

There can't, of course, be any blanket prohibition against using 
kernel headers in userland. Think about ioctl.h, for example.
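
A minimal sketch of the guard described above, assuming a simplified stand-in for the header rather than the real linux/time.h: the kernel-only mktime prototype is visible only when __KERNEL__ is defined, so an ordinary userland build sees just the C library's mktime. The macro KERNEL_DECLS_VISIBLE and the helper kernel_decls_visible() are hypothetical names used only for illustration.

```c
#include <time.h>   /* the C library's mktime(struct tm *) */

/* Hypothetical, simplified stand-in for the guarded part of a kernel header. */
#ifdef __KERNEL__
/* Kernel-only prototype; it clashes with the libc mktime if both are seen. */
unsigned long mktime(unsigned int year, unsigned int mon, unsigned int day,
                     unsigned int hour, unsigned int min, unsigned int sec);
#define KERNEL_DECLS_VISIBLE 1
#else
#define KERNEL_DECLS_VISIBLE 0
#endif

/* Reports whether this translation unit was built with __KERNEL__ defined. */
int kernel_decls_visible(void)
{
    return KERNEL_DECLS_VISIBLE;
}
```

Built normally, kernel_decls_visible() returns 0 and nothing conflicts. Building the same file with -D__KERNEL__ should make the two mktime declarations collide, which is the kind of breakage reported.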
-- 
/Jonathan Lundell.



Re: Controversy over dynamic linking -- how to end the panic

2001-06-21 Thread Jonathan Lundell

At 8:06 PM +0100 2001-06-21, Alan Cox wrote:
>  > > the stdio.h, I'd tell him to go screw himself.
>>  What is the difference between including kernel header file and
>>  including GPLed header file?
>
>There are real differences between programs and interface definitions. At this
>point you get into law and the like and its probably best you read up on it
>from a reputable source not l/k

Though header files don't fall clearly on the interface-definition 
side of the line. ctype.h, for example, in userland, or any other 
header with #defined or inline code.
-- 
/Jonathan Lundell.



Re: Alan Cox quote? (was: Re: accounting for threads)

2001-06-19 Thread Jonathan Lundell

At 9:09 AM -0700 2001-06-19, Larry McVoy wrote:
>Don't you think it is funny that Sun doesn't publish numbers comparing
>their thread performance to process performance?  Sure, you can find
>context switch benchmarks where they have user level switching going on
>but those are a red herring.  The real numbers you want are the kernel
>level context switches and those are just as expensive as the process
>context switch numbers.

Sun (or at least SPARC) is a bit of a special case, though. SPARC's 
register-window architecture makes thread-switching (not to mention 
recursion) significantly more expensive than on most other 
architectures.
-- 
/Jonathan Lundell.



Re: any good diff merging utility?

2001-06-17 Thread Jonathan Lundell

At 2:34 AM +0200 2001-06-18, Ivan Vadovic wrote:
>Very often the case is that they indeed can be merged automagically.
>For example two patches inserting few lines right after the #include
>lines.
>
>patch1:
>@@ 10,1 10,2 @@
>  #include <foo.h>
>+#include <1.h>
>
>patch2:
>@@ 10,1 10,2 @@
>  #include <foo.h>
>+#include <2.h>
>
>The patch will fail to patch :-). But there is no real conflict between
>the patches.

Problem is, you can't tell automatically. Even if the diffs don't 
conflict physically, it's entirely possible that they conflict 
logically.
-- 
/Jonathan Lundell.



Re: Going beyond 256 PCI buses

2001-06-14 Thread Jonathan Lundell

At 10:14 AM -0400 2001-06-14, Jeff Garzik wrote:
>According to the PCI spec it is -impossible- to have more than 256 buses
>on a single "hose", so you simply have to implement multiple hoses, just
>like Alpha (and Sparc64?) already do.  That's how the hardware is forced
>to implement it...

That's right, of course. A small problem is that dev->slot_name 
becomes ambiguous, since it doesn't have any hose identification. Nor 
does it have any room for the hose id; it's fixed at 8 chars, and 
fully used (bb:dd.f\0).
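
A small sketch of why the field is full, assuming an 8-byte buffer and the bb:dd.f layout mentioned above. The function names and the hose-prefixed format are hypothetical, chosen only to show that any prefix overflows the field.

```c
#include <assert.h>
#include <stdio.h>

#define SLOT_NAME_LEN 8   /* "bb:dd.f" is 7 characters plus the NUL */

/* Formats bus:device.function; returns the length snprintf wanted to write. */
int format_slot_name(char out[SLOT_NAME_LEN],
                     unsigned bus, unsigned dev, unsigned fn)
{
    assert(bus <= 0xff && dev <= 0x1f && fn <= 0x7);
    return snprintf(out, SLOT_NAME_LEN, "%02x:%02x.%x", bus, dev, fn);
}

/* Hypothetical hose-prefixed variant: needs at least 9 characters,
 * so the result is truncated in an 8-byte field. */
int format_slot_name_with_hose(char out[SLOT_NAME_LEN], unsigned hose,
                               unsigned bus, unsigned dev, unsigned fn)
{
    assert(bus <= 0xff && dev <= 0x1f && fn <= 0x7);
    return snprintf(out, SLOT_NAME_LEN, "%x:%02x:%02x.%x", hose, bus, dev, fn);
}
```

A return value of SLOT_NAME_LEN or more means the name did not fit; the plain form returns 7, while the hose-prefixed form returns at least 9.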
-- 
/Jonathan Lundell.



Re: Configure.help is complete

2001-06-01 Thread Jonathan Lundell

At 2:59 PM +0200 2001-06-01, David Weinehall wrote:
>  > Not to open a what may be can of worms but ...
>>
>>  What's wrong with procfs?
>
>Imho, a procfs should be for process-information, nothing else.
>The procfs in its current form, while useful, is something horrible
>that should be taken out on the backyard and shot using slugs.
>
>Ehrmmm. No, but seriously, the non-process stuff should be separate
>from the procfs. Maybe call it kernfs or whatever.
>
>>  It allows a general interface to the kernel that does not require new
>>  syscalls/ioctls and can be accessed from user space without specifically
>>  compiled programs. You can use shell scripts, java, command line etc.
>
>Yes, and it's also totally non standardised.

It clearly fills a need, though, and has the distinct side benefit of 
cutting down on the proliferation of ioctls. Sure, it's non-standard 
and a mess. But it's semi-documented, easy to use, and v. general. 
What's the preferred alternative, to state the first question another 
way? For any single small project/driver, creating a new fs simply 
isn't going to happen.
-- 
/Jonathan Lundell.



Re: How to know HZ from userspace?

2001-05-30 Thread Jonathan Lundell

At 1:38 AM +0100 2001-05-31, Joel Becker wrote:
>On Wed, May 30, 2001 at 05:24:37PM -0700, Jonathan Lundell wrote:
>>  FWIW (perhaps not much in this context), the POSIX way is 
>>sysconf(_SC_CLK_TCK)
>>
>>  POSIX sysconf is pretty useful for this kind of thing (not just HZ, either).
>
>   Well, how many hundred things on Linux are available from /proc
>but not from sysconf or the like?  :-)
>
>Joel

Lots. Maybe we oughta have /proc/sysconf/... (there's no reason 
sysconf() can't be a library reading /proc).
-- 
/Jonathan Lundell.



Re: How to know HZ from userspace?

2001-05-30 Thread Jonathan Lundell

At 5:07 PM -0700 2001-05-30, H. Peter Anvin wrote:
>  > If you now want to set those values from a userspace program / script in
>>  a portable manner, you need to be able to find out of HZ of the currently
>>  running kernel.
>>
>
>Yes, but that's because the interfaces are broken.  The decision has
>been that these values should be exported using the default HZ for the
>architecture, and that it is the kernel's responsibility to scale them
>when HZ != USER_HZ.  I don't know if any work has been done in this
>area.

FWIW (perhaps not much in this context), the POSIX way is sysconf(_SC_CLK_TCK)

POSIX sysconf is pretty useful for this kind of thing (not just HZ, either).
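
A minimal sketch of the POSIX call, assuming a hosted POSIX environment; the wrapper name clock_ticks_per_second() is made up for illustration. The value is the tick rate the system exposes to userland, which may not match the kernel's internal HZ.

```c
#include <unistd.h>

/* Returns the number of clock ticks per second as reported by sysconf(),
 * or -1 if the value cannot be determined. */
long clock_ticks_per_second(void)
{
    return sysconf(_SC_CLK_TCK);
}
```

A program that receives tick counts from the system, for example from times(), would divide by this value to convert to seconds instead of hard-coding a constant.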
-- 
/Jonathan Lundell.



Re: [CHECKER] large stack variables (>=1K) in 2.4.4 and 2.4.4-ac8

2001-05-25 Thread Jonathan Lundell

At 8:45 AM -0700 2001-05-25, dean gaudet wrote:
>i think it really depends on how you use current -- here's an alternative
>usage which can fold the extra addition into the structure offset
>calculations, and moves the task struct to the top of the stack.
>
>not that this really solves anything, 'cause a stack underflow will just
>trash something else rather than the task struct :)

It would open the door for putting a guard page (which only occupies 
virtual space, after all) below the stack. I have no idea whether 
that's practical, given other constraints, but it's a potential 
benefit of having the stack at the bottom rather than the top of a 
page.
-- 
/Jonathan Lundell.



Re: Dying disk and filesystem choice.

2001-05-24 Thread Jonathan Lundell

At 5:56 PM +0200 2001-05-24, Andi Kleen wrote:
>On Thu, May 24, 2001 at 08:50:04AM -0700, Jonathan Lundell wrote:
>  > At 10:31 AM +0200 2001-05-24, Andi Kleen wrote:
>>  >reiserfs doesn't, but the HD usually has transparently in its firmware.
>>  >So it hits a bad block; you see an IO error and the next time you hit
>>  >the block the firmware has mapped in a fresh one from its internal
>>  >reserves.
>>
>>  Drives have remapping capability, but it's the first I've heard of HD
>>  firmware doing it automatically. I'd be very interested in reading
>>  the relevant documentation, if you could provide a pointer. Seems to
>>  me if a drive *could* do this, you'd certainly want to turn it
>>  (automatic remapping) off. There's way too much chance that a system
>>  will read the remapped sector and assume that it contains the
>>  original data. That would be hopelessly corrupting.
>
>There are two scenarios: read and write. For write doing remapping transparent
>is all fine, as the data is destroyed anyways.
>For read it returns an IO error once and the next time you read from that
>block it contains fresh (or partly recovered) data.

What HDs are we talking about, specifically?

WRT writes, how does the drive detect the error?

WRT reads, there are too many filesystems that would accept the 
second (no-IO-error) read as being the original good data.

IBM's UltraStar drives have an option (a bit in a vendor-unique mode 
page) that enables automatic reassignment, but it's done safely. If 
an unrecoverable read error is reported, the block is entered in a 
list of reassignment candidates. If that block is subsequently 
written, it's written back to the original location, and then 
verified. If the verify fails, the block is reassigned and rewritten; 
if it succeeds, it's left in the original location, and the block is 
removed from the reassignment candidate list.

Notice that invalid data is never returned without an error 
indication. That's critical.
-- 
/Jonathan Lundell.



Re: Dying disk and filesystem choice.

2001-05-24 Thread Jonathan Lundell

At 12:19 PM +0200 2001-05-24, Jens Axboe wrote:
>In fact you will typically only see an I/O error if the drive _can't_
>remap the sector anymore, because it has run out. No point in reporting
>a condition that was recovered.
>
>I'd still say, that if you get bad block errors reported from your disk
>it's long overdue for replacement.

This can't be right. It implies that the drive is returning bogus 
data with no error indication. Remapping a bad sector is not the same 
as recovering it.
-- 
/Jonathan Lundell.



Re: Dying disk and filesystem choice.

2001-05-24 Thread Jonathan Lundell

At 10:31 AM +0200 2001-05-24, Andi Kleen wrote:
>reiserfs doesn't, but the HD usually has transparently in its firmware.
>So it hits a bad block; you see an IO error and the next time you hit
>the block the firmware has mapped in a fresh one from its internal
>reserves.

Drives have remapping capability, but it's the first I've heard of HD 
firmware doing it automatically. I'd be very interested in reading 
the relevant documentation, if you could provide a pointer. Seems to 
me if a drive *could* do this, you'd certainly want to turn it 
(automatic remapping) off. There's way too much chance that a system 
will read the remapped sector and assume that it contains the 
original data. That would be hopelessly corrupting.
-- 
/Jonathan Lundell.



Re: alpha iommu fixes

2001-05-22 Thread Jonathan Lundell

At 10:24 PM +0100 2001-05-22, Alan Cox wrote:
>  > On the main board, and not just the old ones. These days it's
>>  typically in the chipset's south bridge. "Third-party DMA" is
>>  sometimes called "fly-by DMA". The ISA card is a slave, as is memory,
>>  and the DMA chip reads from one and writes to the other.
>
>There is also another mode which will give the Alpha kittens I suspect. A
>few PCI cards do SB emulation by snooping the PCI bus. So the kernel writes
>to the ISA DMA controller which does a pointless ISA transfer and the PCI
>card sniffs the DMA controller setup (as it goes to pci, then when nobody
>claims it on to the isa bridge) then does bus mastering DMA of its own to fake
>the ISA dma

That's sick.
-- 
/Jonathan Lundell.



Re: alpha iommu fixes

2001-05-22 Thread Jonathan Lundell

At 2:02 PM -0700 2001-05-22, Richard Henderson wrote:
>On Tue, May 22, 2001 at 01:48:23PM -0700, Jonathan Lundell wrote:
>>  64KB for 8-bit DMA; 128KB for 16-bit DMA. [...]  This doesn't
>>  apply to bus-master DMA, just the legacy (8237) stuff.
>
>Would this 8237 be something on the ISA card, or something on
>the old pc mainboards?  I'm wondering if we can safely ignore
>this issue altogether here...

On the main board, and not just the old ones. These days it's 
typically in the chipset's south bridge. "Third-party DMA" is 
sometimes called "fly-by DMA". The ISA card is a slave, as is memory, 
and the DMA chip reads from one and writes to the other.

IDE didn't originally use DMA at all (but floppies did), just 
programmed IO. These days, PC chipsets mostly have some form of 
extended higher-performance DMA facilities for stuff like IDE, but 
I'm not really familiar with the details.

I do wish Linux didn't have so much PC legacy sh^Htuff 
embedded into the i386 architecture.

>  > There was also a 24-bit address limitation.
>
>Yes, that's in the number of address lines going to the isa card.
>We work around that one by having an iommu arena from 8M to 16M
>and forcing all ISA traffic to go through there.


-- 
/Jonathan Lundell.



Re: alpha iommu fixes

2001-05-22 Thread Jonathan Lundell

At 1:28 PM -0700 2001-05-22, Richard Henderson wrote:
>On Tue, May 22, 2001 at 05:00:16PM +0200, Andrea Arcangeli wrote:
>>  I'm also wondering if ISA needs the sg to start on a 64k boundary,
>
>Traditionally, ISA could not do DMA across a 64k boundary.
>
>The only ISA card I have (a soundblaster compatible) appears
>to work without caring for this, but I suppose we should pay
>lip service to pedantics.

64KB for 8-bit DMA; 128KB for 16-bit DMA. It's a limitation of the 
legacy third-party-DMA controllers, which had only 16-bit address 
registers (the high part of the address lives in a non-counting 
register). This doesn't apply to bus-master DMA, just the legacy 
(8237) stuff. There was also a 24-bit address limitation.
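
To make the boundary rule concrete, here is a minimal Python sketch of the check. It is not taken from any kernel source; the function name and parameters are invented for illustration. It assumes a buffer is usable only if it stays inside one 64KB page (8-bit channels) or one 128KB page (16-bit channels) and below the 24-bit (16MB) address limit.

```python
ISA_ADDR_LIMIT = 1 << 24  # 24-bit address limitation: 16MB


def isa_dma_ok(phys_addr, length, sixteen_bit=False):
    """Return True if [phys_addr, phys_addr + length) suits legacy 8237-style DMA.

    Hypothetical helper, for illustration only.
    """
    if length <= 0:
        return False
    page = (128 if sixteen_bit else 64) * 1024
    last = phys_addr + length - 1
    if last >= ISA_ADDR_LIMIT:
        return False
    # Only the low part of the address counts; the high part sits in a
    # non-counting register, so first and last byte must share a page.
    return phys_addr // page == last // page
```

A two-byte transfer starting at 0xFFFF fails on an 8-bit channel because it straddles the 64KB boundary, while the same transfer passes on a 16-bit channel.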
-- 
/Jonathan Lundell.



Re: alpha iommu fixes

2001-05-22 Thread Jonathan Lundell

At 11:12 PM +1200 2001-05-22, Chris Wedgwood wrote:
>On Mon, May 21, 2001 at 03:19:54AM -0700, David S. Miller wrote:
>
> Electrically (someone correct me, I'm probably wrong) PCI is
> limited to 6 physical plug-in slots I believe, let's say it's 8
> to choose an arbitrary larger number to be safe.
>
>Minor nit... it can in fact be higher than this, but typically it is
>not. CompactPCI implementations may go higher (different electrical
>characteristics allow for this).

Compact PCI specifies a max of 8 slots (one of which is typically the 
system board). Regular PCI doesn't have a hard and fast slot limit 
(except for the logical limit of 32 devices per bus); the limits are 
driven by electrical loading concerns. As I recall, a bus of typical 
length can accommodate 10 "loads", where a load is either a device 
pin or a slot connector (that is, an expansion card counts as two 
loads, one for the device and one for the connector). (I take this to 
be a rule of thumb, not a hard spec, based on the detailed electrical 
requirements in the PCI spec.)
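
The rule of thumb can be put into a few lines of Python. This is a sketch only: the budget of 10 is the remembered figure above, not a spec value, and `onboard_devices` is a made-up parameter that assumes every device soldered onto the bus costs one load.

```python
LOAD_BUDGET = 10  # rule-of-thumb figure from the text, not a spec value


def max_slots(onboard_devices, budget=LOAD_BUDGET):
    """Slots that still fit after soldered-down devices take one load each.

    A populated slot costs two loads: the connector plus the card's device.
    """
    remaining = budget - onboard_devices
    return max(remaining // 2, 0)
```

With nothing else on the bus this gives 5 slots; with two on-board devices it gives 4.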

Still, the presence of bridges opens up the number of devices on a 
root PCI bus to a very high number, logically. Certainly having three 
or four quad Ethernet cards, so 12 or 16 devices, is a plausible 
configuration. As for bandwidth, a 64x66 PCI bus has a nominal burst 
bandwidth of 533 MB/second, which would be saturated by 20 full 
duplex 100baseT ports that were themselves saturated in both 
directions (all ignoring overhead). Full saturation is not reasonable 
for either PCI or Ethernet; I'm just looking at order-of-magnitude 
numbers here.
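
The order-of-magnitude arithmetic can be reproduced as follows. The sketch assumes a nominal clock of 66,666,667 Hz for "66 MHz" PCI and ignores all protocol overhead, as the text does; the function names are illustrative.

```python
def pci_burst_bytes_per_sec(width_bits, clock_hz):
    """Nominal burst bandwidth: one bus-width transfer per clock."""
    return width_bits // 8 * clock_hz


def ports_to_saturate(bus_bytes_per_sec, link_bits_per_sec, full_duplex=True):
    """Whole number of saturated Ethernet ports the bus could carry."""
    per_port = link_bits_per_sec // 8 * (2 if full_duplex else 1)
    return bus_bytes_per_sec // per_port


bus = pci_burst_bytes_per_sec(64, 66_666_667)   # 64x66 PCI
ports = ports_to_saturate(bus, 100_000_000)     # 100baseT, both directions
print(bus // 1_000_000, ports)                  # prints "533 21"
```

The result of 21 ports is in line with the figure of roughly 20 above.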

The bottom line is: don't make any hard and fast assumption about the 
number of devices connected to a root PCI bus.
-- 
/Jonathan Lundell.



Re: alpha iommu fixes

2001-05-22 Thread Jonathan Lundell

At 10:24 PM +0100 2001-05-22, Alan Cox wrote:
>>  On the main board, and not just the old ones. These days it's
>>  typically in the chipset's south bridge. "Third-party DMA" is
>>  sometimes called "fly-by DMA". The ISA card is a slave, as is memory,
>>  and the DMA chip reads from one and writes to the other.
>
>There is also another mode which will give the Alpha kittens I suspect. A
>few PCI cards do SB emulation by snooping the PCI bus. So the kernel writes
>to the ISA DMA controller which does a pointless ISA transfer and the PCI
>card sniffs the DMA controller setup (as it goes to pci, then when nobody
>claims it on to the isa bridge) then does bus mastering DMA of its own to fake
>the ISA dma

That's sick.
-- 
/Jonathan Lundell.



Re: alpha iommu fixes

2001-05-21 Thread Jonathan Lundell

At 3:19 AM -0700 2001-05-21, David S. Miller wrote:
>This is totally wrong in two ways.
>
>Let me fix this, the IOMMU on these machines is per PCI bus, so this
>figure should be drastically lower.
>
>Electrically (someone correct me, I'm probably wrong) PCI is limited
>to 6 physical plug-in slots I believe, let's say it's 8 to choose an
>arbitrary larger number to be safe.
>
>Then we have:
>
>max bytes per bttv: max_gbuffers * max_gbufsize
>   64   * 0x208000  == 133.12MB
>
>133.12MB * 8 PCI slots == ~1.06 GB
>
>Which is still only half of the total IOMMU space available per
>controller.

8 slots (and  you're right, 6 is a practical upper limit, fewer for 
66 MHz) *per bus*. Buses can proliferate like crazy, so the slot 
limit becomes largely irrelevant. A typical quad Ethernet card, for 
example (and this is true for many/most multiple-device cards), has a 
bridge, its own internal PCI bus, and four "slots" ("devices" in PCI 
terminology).
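
As a rough check of the quoted arithmetic, and of what happens once bridges raise the device count, here is a small Python sketch. The two constants come from the quoted message; the 16-device case is hypothetical, chosen to match four quad cards behind bridges.

```python
MAX_GBUFFERS = 64          # figures taken from the quoted message
MAX_GBUFSIZE = 0x208000    # bytes per buffer

per_device = MAX_GBUFFERS * MAX_GBUFSIZE
MIB = 1 << 20


def iommu_demand(devices):
    """Worst-case bytes mapped if every device uses its full allowance."""
    return devices * per_device


print(per_device // MIB)          # prints 130
print(iommu_demand(8) // MIB)     # prints 1040
print(iommu_demand(16) // MIB)    # prints 2080
```

The "133.12MB" in the quote appears to be the same quantity expressed as 133,120 KB. Whether 16 such devices fit depends on the IOMMU window of the machine, which this sketch does not assume.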
-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-20 Thread Jonathan Lundell

At 2:16 AM +1200 2001-05-21, Chris Wedgwood wrote:
>On Sat, May 19, 2001 at 10:36:14AM -0700, Jonathan Lundell wrote:
>
> I know from system documentation, or can figure out once and for
> all by experimentation, the correspondence between PCI
> bus/dev/fcn and physical locations. Jeff's extension gives me the
> mapping between eth# and PCI bus/dev/fcn, which is not otherwise
> available (outside the kernel).
>
>Won't work with hotplug PCI (consider plugging in something with a
>bridge).

It's true that hotplug devices make it more complicated, but I think 
the result can be achieved by describing the correspondence 
topologically rather than as a simple b/d/f-to-location table.
-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-20 Thread Jonathan Lundell

At 3:37 AM -0600 2001-05-20, Eric W. Biederman wrote:
>Jonathan Lundell <[EMAIL PROTECTED]> writes:
>
>>  At 10:42 AM +0200 2001-05-19, Kai Henningsen wrote:
>>  >>  Jeff Garzik's ethtool
>>  >>  extension at least tells me the PCI bus/dev/fcn, though, and from
>>  >>  that I can write a userland mapping function to the physical
>>  >>  location.
>>  >
>>  >I don't see how PCI bus/dev/fcn lets you do that.
>>
>>  I know from system documentation, or can figure out once and for all
>>  by experimentation, the correspondence between PCI bus/dev/fcn and
>>  physical locations. Jeff's extension gives me the mapping between
>>  eth# and PCI bus/dev/fcn, which is not otherwise available (outside
>>  the kernel).
>
>Just a second let me reenumerate your pci busses, and change all of the bus
>numbers.  Not that this is a bad thought.  It is just you need to know
>the tree of PCI busses/bridges up to the root on the machine in question.

Yes, you do. And it's true that renumbering is problematical; I 
hadn't thought of all the implications. Say, you have a system with 
hot-plug slots on two buses, and someone hot-plugs a card with a 
bridge (fairly common; most dual/quad Ethernet boards have a bridge). 
If the buses were numbered densely to begin with, they're going to 
have to be renumbered above the point that the new bridge was added.

Phooey. Well, it can still be done, but it's a bit more complicated 
than the bus/dev/fcn-to-location map I was imagining. You'd have to 
describe the topology of the built-in buses, and dynamically make the 
correspondences. As you say, "know the tree", by topology, not bus 
numbers.
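
One way to picture "by topology, not bus numbers" is to name a device by the chain of (device, function) hops from the root bus down through each bridge. The Python sketch below is only an illustration: the `parent_bridge` table and the tuple layout are assumptions, not an existing kernel or userland interface, and it assumes a single root bus.

```python
def topo_path(device, parent_bridge):
    """Walk from a device up to the root, collecting (dev, fn) hops.

    `parent_bridge` maps a bus number to the (bus, dev, fn) of the bridge
    leading to it; the root bus has no entry. The returned path holds no
    bus numbers, so renumbering the buses does not change it.
    """
    bus, dev, fn = device
    path = [(dev, fn)]
    while bus in parent_bridge:
        bus, dev, fn = parent_bridge[bus]
        path.append((dev, fn))
    return tuple(reversed(path))


# Before hot-plug: bridges at 0:4.0 -> bus 1 and 0:5.0 -> bus 2.
before = topo_path((2, 3, 0), {1: (0, 4, 0), 2: (0, 5, 0)})
# After a bridged card is plugged in under bus 1, old bus 2 becomes bus 3.
after = topo_path((3, 3, 0), {1: (0, 4, 0), 2: (1, 0, 0), 3: (0, 5, 0)})
```

In this made-up layout the NIC's bus number changes from 2 to 3, yet both calls yield the same path, which is what a location table would be keyed on.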


-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-19 Thread Jonathan Lundell

At 10:42 AM +0200 2001-05-19, Kai Henningsen wrote:
>>  >Make your config script look at the hardware MAC addresses. Those don't
>>  >change.
>>
>>  They're not necessarily unique, though.
>
>So if you plug both into the same network segment, that segment is broken? 
>That looks like very stupid design to me.
>
>It's not as if getting enough unique MAC addresses was particularly 
>expensive. These days, even el-cheapo PC network cards get that right. 
>(And have for quite a number of years.)

Many do, some don't. Moreover, the MAC address is volatile in that it 
can be changed at will (via, eg, ifconfig).

I assume that the reason that Sun (for example) defaults to all MAC 
addresses on a system being the same is that it doesn't make sense, 
ordinarily, to plug two Ethernet interfaces into the same network 
segment. If, for some reason, you really want to do that, there's 
ifconfig ready to reassign the MAC address.

If I plug both into the same network segment by accident (because I 
can't tell which is which, say), then my configuration is nearly as 
broken with different MAC addresses as with identical ones; the fix 
is to replug correctly, not to change MAC addresses.
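
A config script that keys on MAC addresses could at least detect the non-unique case before relying on it. The following Python sketch assumes the script already has a name-to-MAC table from somewhere; the function name and the addresses in the example are made up.

```python
from collections import defaultdict


def ambiguous_macs(interfaces):
    """Return {mac: [names]} for every MAC shared by more than one interface."""
    by_mac = defaultdict(list)
    for name, mac in interfaces.items():
        by_mac[mac.lower()].append(name)
    return {mac: sorted(names) for mac, names in by_mac.items() if len(names) > 1}


example = {
    "eth0": "08:00:20:AA:BB:CC",
    "eth1": "08:00:20:aa:bb:cc",   # same address as eth0
    "eth2": "00:a0:c9:11:22:33",
}
clashes = ambiguous_macs(example)
```

Here `clashes` reports eth0 and eth1 under one address, so a MAC-keyed configuration could not tell them apart.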
-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-19 Thread Jonathan Lundell

At 10:42 AM +0200 2001-05-19, Kai Henningsen wrote:
>>  Jeff Garzik's ethtool
>>  extension at least tells me the PCI bus/dev/fcn, though, and from
>>  that I can write a userland mapping function to the physical
>>  location.
>
>I don't see how PCI bus/dev/fcn lets you do that.

I know from system documentation, or can figure out once and for all 
by experimentation, the correspondence between PCI bus/dev/fcn and 
physical locations. Jeff's extension gives me the mapping between 
eth# and PCI bus/dev/fcn, which is not otherwise available (outside 
the kernel).
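
The userland mapping amounts to two table lookups. In this Python sketch both tables are invented for illustration: the first stands in for what the kernel reports per interface, the second for what the system documentation or experimentation would supply.

```python
# Both tables are hypothetical sample data.
ETH_TO_BDF = {"eth0": (0, 9, 0), "eth1": (1, 4, 0)}
BDF_TO_SLOT = {(0, 9, 0): "on-board port A", (1, 4, 0): "slot 3, upper jack"}


def physical_location(ifname):
    """Resolve an interface name to a human-readable location, or None."""
    bdf = ETH_TO_BDF.get(ifname)
    return BDF_TO_SLOT.get(bdf)
```

An unknown interface simply yields None, leaving the caller to decide how to report it.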
-- 
/Jonathan Lundell.



Re: Storage - redundant path failover / failback - quo vadis linux?

2001-05-18 Thread Jonathan Lundell

At 9:03 AM +0200 2001-05-18, [EMAIL PROTECTED] wrote:
>  >My question is which way is the more probable solution for future linux
>  >kernels?
>  >The low-level-approach of the "T3"-patch requires changes to the
>  >scsi-drivers and the hardware-drivers but provides optimal communication
>  >between the driver and the hardware
>
>Thinking about it: if there would be some sort of 'available' flag 
>in the gendisk structure, that would be updated by the low-level 
>drivers. This could then be used by a high-level design to use or skip a 
>failed device/path... In the S/390 (or zSeries) environment the 
>device drivers are even able to detect a failing connection even if 
>there is no data going to a device. That way the device would be 
>disabled even _before_ anybody tries to write...
>
>  >The high-level-approach of the "multipath"-personality is
>  >hardware-independent but works very slowly. On the other hand I see no
>  >clear way how to check for availability of the (previously failed) primary
>  >channel to automate a fail-back.
>
>Well, slower, but I think there will be many that take that 
>performance loss already by using lvm or md (for the benefit of 
>flexible/large filesystems); this approach would add failover while 
>being IMHO only a little less performant.

The flag idea, or some equivalent way for the low-level driver to 
communicate to the multi-pathing level, seems exactly right. I'm 
guessing that provision needs to be made for some 
external-device-dependent means of signalling both failure and 
recovery. There are potentially side-channel/out-of-band means to 
communicate this kind of status from specific devices.
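
A minimal sketch of the flag idea, in Python rather than kernel C and with invented class names: the low-level driver (or an out-of-band monitor) flips `available`, and the multipath layer only reads it. Nothing here reflects the actual gendisk structure or the md multipath code.

```python
class Path:
    def __init__(self, name):
        self.name = name
        self.available = True   # updated by the low-level driver, not the selector


class Multipath:
    """Prefer the first listed path; use later ones while it is down."""

    def __init__(self, paths):
        self.paths = list(paths)

    def select(self):
        for path in self.paths:
            if path.available:
                return path
        raise IOError("no path available")


primary, secondary = Path("primary"), Path("secondary")
mp = Multipath([primary, secondary])
primary.available = False        # driver signals failure: fail over
during_failure = mp.select().name
primary.available = True         # driver signals recovery: fail back
after_recovery = mp.select().name
```

Because selection re-reads the flag on every call, fail-back needs no separate probing in the high-level layer; it happens as soon as the driver reports recovery.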
-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-17 Thread Jonathan Lundell

At 11:23 PM +0200 2001-05-17, Kai Henningsen wrote:
>[EMAIL PROTECTED] (Jonathan Lundell)  wrote on 15.05.01 in 
><p05100316b7272cdfd50c@[207.213.214.37]>:
>
>>  What about:
>>
>>  1 (network domain). I have two network interfaces that I connect to
>>  two different network segments, eth0 & eth1; they're ifconfig'd to
>>  the appropriate IP and MAC addresses. I really do need to know
>>  physically which (physical) hole to plug my eth0 cable into.
>
>Sorry, the software doesn't know that. Never has, for that matter.

Well, no, it doesn't. That's a problem. Jeff Garzik's ethtool 
extension at least tells me the PCI bus/dev/fcn, though, and from 
that I can write a userland mapping function to the physical 
location. My point, though, is that finding the socket is a real-life 
problem on systems with multiple interfaces. I don't expect the 
kernel to know the physical locations, but the user has to be able to 
get from kernel/ifconfig names (eth#) to sockets, one way or another. 
Support for a uniform means of doing the mapping, even if it needs 
userland help, would be good.

>>  (Extension: same situation, but it's a firewall and I've got 12 ports
>>  to connect.) (Extension #2: if I add a NIC to the system and reboot,
>>  I'd really prefer that the NICs already in use didn't get renumbered.)
>
>Make your config script look at the hardware MAC addresses. Those don't
>change.

They're not necessarily unique, though.

>>  2 (disk domain). I have multiple spindles on multiple SCSI adapters.
>>  I want to allocate them to more than one RAID0/1/5 set, with the
>>  usual considerations of putting mirrors on different adapters,
>>  spreading my RAID5 drives optimally, ditto stripes. I need (eg) SCSI
>>  paths to config all this, and I further need real physical locations
>>  to identify failed drives that need to be hot-replaced. The mirror
>>  members will move around as drives are replaced and hot spares come
>>  into play.
>
>Use partition UUIDs, or SCSI serial numbers, or whatever. This works
>today.

This pushes the problem back in time: I need to write the UUID, for 
example, at some point. And, with hot-swappable drives, I'm still 
interested in the physical location. I really don't know that there's 
a good answer to this problem, especially with FC, but I need to tell 
an operator, "replace this particular physical drive". It doesn't do 
any good to tell the operator the UUID.

>>  Seems like more than merely informational.
>
>The *location*? Nope. Some unique id for the device, if available at all:
>sure.

What good does it do to tell an operator to connect a cable to a MAC 
address? Or to remove a drive having a particular UUID? If it's "mere 
information", it's *necessary* mere information.

>>  (A side observation: PCI or SCSI bus/device/lun/etc paths are not
>>  physical locations; you also need external hardware-specific
>>  knowledge to be able to talk about real physical locations in a way
>>  that does the system operator any good.)
>
>And those you typically do not have.

But (ideally) should.

-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-17 Thread Jonathan Lundell

At 11:23 PM +0200 2001-05-17, Kai Henningsen wrote:
[EMAIL PROTECTED] (Jonathan Lundell)  wrote on 15.05.01 in 
p05100316b7272cdfd50c@[207.213.214.37]:

  What about:

  1 (network domain). I have two network interfaces that I connect to
  two different network segments, eth0  eth1; they're ifconfig'd to
  the appropriate IP and MAC addresses. I really do need to know
  physically which (physical) hole to plug my eth0 cable into.

Sorry, the software doesn't know that. Never has, for that matter.

Well, no, it doesn't. That's a problem. Jeff Garzik's ethtool 
extension at least tells me the PCI bus/dev/fcn, though, and from 
that I can write a userland mapping function to the physical 
location. My point, though, is that finding the socket is a real-life 
problem on systems with multiple interfaces. I don't expect the 
kernel to know the physical locations, but the user has to be able to 
get from kernel/ifconfig names (eth#) to sockets, one way or another. 
Support for a uniform means of doing the mapping, even if it needs 
userland help, would be good.

   (Extension: same situation, but it's a firewall and I've got 12 ports
  to connect.) (Extension #2: if I add a NIC to the system and reboot,
  I'd really prefer that the NICs already in use didn't get renumbered.)

Make your config script look at the hardware MAC addresses. Those don't
change.

They're not necessarily unique, though.

   2 (disk domain). I have multiple spindles on multiple SCSI adapters.
  I want to allocate them to more than one RAID0/1/5 set, with the
  usual considerations of putting mirrors on different adapters,
  spreading my RAID5 drives optimally, ditto stripes. I need (eg) SCSI
  paths to config all this, and I further need real physical locations
  to identify failed drives that need to be hot-replaced. The mirror
  members will move around as drives are replaced and hot spares come
  into play.

Use partition UUIDs, or SCSI serial numbers, or whatever. This works
today.

This pushes the problem back in time: I need to write the UUID, for 
example, at some point. And, with hot-swappable drives, I'm still 
interested in the physical location. I really know know that there's 
a good answer to this problem, especially with FC, but I need to tell 
an operator, replace this particular physical drive. It doesn't do 
any good to tell the operator the UUID.

   Seems like more that merely informational.

The *location*? Nope. Some unique id for the device, if available at all:
sure.

What good does it do to tell an operator to connect a cable to a MAC 
address? Or to remove a drive having a particular UUID? If it's mere 
information, it's *necessary* mere information.

>>  (A side observation: PCI or SCSI bus/device/lun/etc paths are not
>>  physical locations; you also need external hardware-specific
>>  knowledge to be able to talk about real physical locations in a way
>>  that does the system operator any good.)
>
>And those you typically do not have.

But (ideally) should.

-- 
/Jonathan Lundell.



Re: ((struct pci_dev*)dev)->resource[...].start

2001-05-16 Thread Jonathan Lundell

At 5:37 PM -0400 2001-05-16, Jeff Garzik wrote:
>This is not a safe assumption, because the OS may reprogram the PCI BARs
>at certain times.  The rule is:  ALWAYS read from dev->resource[] unless
>you are a bus driver (PCI bridges, for example, need to assign
>resources).

Would you please elaborate? If I understand what you're saying, you 
can't rely on the "pointer" returned by ioremap() because the OS 
might reprogram the relevant BAR out from under you. So one would 
need to know: when does a driver have to re-ioremap() due to the BAR 
having been (potentially) changed? I'd expect the answer to be: for 
all practical purposes never.

-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-16 Thread Jonathan Lundell

At 4:57 PM +0200 2001-05-16, Vojtech Pavlik wrote:
>On Wed, May 16, 2001 at 07:37:45AM -0700, Jonathan Lundell wrote:
>>  At 10:02 AM +0200 2001-05-16, Vojtech Pavlik wrote:
>>>>  It's also true that some buses simply don't yield up physical
>>>>  locations (ISA springs to mind,
>>>
>>>ISA is quite fine, you can use the i/o space as physical locations.
>>
>>  I meant physical not as in physical-vs-virtual addresses (all ISA
>>  addresses, memory or IO, are physical in this sense, by the time they
>>  get to the bus). Rather, I meant that you can't determine which slot
>>  a given device is plugged into. If you have two NICs in two ISA
>>  slots, there's no way to distinguish between the slots. In practice,
>>  you'd have to experiment or remove a card and check the jumpering or
>>  some such.
>
>Yes. But I meant that while this indeed is not possible, still the i/o
>port address can be used instead of the slot number, because it at least
>is physically jumpered and must be unique.

Yes, I agree. And it's stable (whereas "physical" PCI addresses are 
not). Best we've got for ISA (though it's true for ISA memory 
addresses as well).

-- 
/Jonathan Lundell.



RE: LANANA: To Pending Device Number Registrants

2001-05-16 Thread Jonathan Lundell

At 11:56 AM +0200 2001-05-16, Chemolli Francesco (USI) wrote:
>We could do something like baptizing disks.. Fix some location
>(i.e. the absolutely last sector of the disk or the partition table or
>whatever) and store there some 32-bit ID
>(could be a random number, a progressive number, whatever).

Most of these solutions (and RAID IDs and UUIDs) don't completely 
solve the problem; they just push it to a different time: how do you 
talk about a new disk, or a new RAID array, or a moved disk? And what 
about removable media (not neglecting the possibility of multiple 
drives)? Removable media from another OS? Shared drives?

Not that this kind of "firm" ID might not be an improvement, or at 
least a good sanity check.
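
A minimal sketch of the "baptizing" idea, to make the limitation concrete. Assumptions that are not in the thread: 512-byte sectors, a made-up magic number so that a labelled disk can be told from an unlabelled one, and little-endian storage. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u          /* assumed sector size */
#define LABEL_MAGIC 0x4449534bu   /* "DISK"; invented for this sketch */

/* Byte offset of the last whole sector on a disk of disk_bytes bytes. */
uint64_t last_sector_offset(uint64_t disk_bytes)
{
    assert(disk_bytes >= SECTOR_SIZE);
    return (disk_bytes / SECTOR_SIZE - 1) * SECTOR_SIZE;
}

/* Fill a sector buffer with magic + id, both stored little-endian. */
void label_write(uint8_t sector[SECTOR_SIZE], uint32_t id)
{
    uint32_t words[2] = { LABEL_MAGIC, id };

    memset(sector, 0, SECTOR_SIZE);
    for (int w = 0; w < 2; w++)
        for (int b = 0; b < 4; b++)
            sector[w * 4 + b] = (uint8_t)(words[w] >> (8 * b));
}

/* Returns 0 and fills *id if the sector carries a label, -1 otherwise. */
int label_read(const uint8_t sector[SECTOR_SIZE], uint32_t *id)
{
    uint32_t words[2] = { 0, 0 };

    for (int w = 0; w < 2; w++)
        for (int b = 0; b < 4; b++)
            words[w] |= (uint32_t)sector[w * 4 + b] << (8 * b);
    if (words[0] != LABEL_MAGIC)
        return -1;              /* blank or foreign disk: no ID yet */
    *id = words[1];
    return 0;
}
```

label_read() fails on a blank sector, which is exactly the new-disk case: something still has to decide, by some other means, which drive gets which ID.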

[Side question, not original with me: why isn't all this a 2.5 discussion?]
-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-16 Thread Jonathan Lundell

At 10:02 AM +0200 2001-05-16, Vojtech Pavlik wrote:
>>  It's also true that some buses simply don't yield up physical
>>  locations (ISA springs to mind,
>
>ISA is quite fine, you can use the i/o space as physical locations.

I meant physical not as in physical-vs-virtual addresses (all ISA 
addresses, memory or IO, are physical in this sense, by the time they 
get to the bus). Rather, I meant that you can't determine which slot 
a given device is plugged into. If you have two NICs in two ISA 
slots, there's no way to distinguish between the slots. In practice, 
you'd have to experiment or remove a card and check the jumpering or 
some such.
-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-16 Thread Jonathan Lundell

At 12:31 PM +1000 2001-05-16, Andrew Morton wrote:
>>  When I ifconfig one of a collection of interfaces, I'm very much
>>  talking about the specific physical interface connected via a
>>  specific physical cable to a specific physical switch port.
>>
>
>Yes, it can be a security trap as well - physically move a card and
>your firewall rules end up being applied to the wrong connection.
>
>The 2.4 kernel allows you to rename an interface.  So you can build
>a little database of (MAC address/name) pairs. Apply this after booting
>and before bringing up the interfaces and everything has the name
>you wanted, based on MAC address.
>
>Andi Kleen has an app which does this:
>
>   ftp://ftp.firstfloor.org/pub/ak/smallsrc/nameif.c
>
>but apparently some additional kernel work is needed to make
>this work 100% correctly.  I do not know what the specific
>problem is.

There's a bit of a catch 22, though, if you don't have unique MAC 
addresses in the system (across multiple interfaces). It's common 
practice in the SPARC world (Solaris, anyway) for all the interfaces 
to default to a single system-wide MAC address. The fact that MAC 
addresses are at least semi-volatile is also bothersome.

It's also true that some buses simply don't yield up physical 
locations (ISA springs to mind, and I gather that FC is squishy that 
way), but it's desirable to be able to make the connection all ways 
(eth# <-> bus location <-> physical location <-> MAC address) in a 
uniform manner. (Where MAC address might be something else in a 
non-Ethernet domain.)
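
One way to picture the catch-22, as a sketch only: a nameif-style table of (MAC address, name) pairs is usable only when exactly one detected interface reports a given MAC. The structs and function names below are hypothetical and are not taken from nameif.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mac_name {               /* one row of the admin's little database */
    unsigned char mac[6];
    const char *name;
};

struct iface {                  /* one interface as detected at boot */
    char kernel_name[16];
    unsigned char mac[6];
};

/* Count how many detected interfaces report the given MAC address. */
size_t mac_count(const struct iface *seen, size_t n_seen,
                 const unsigned char mac[6])
{
    size_t hits = 0;

    for (size_t i = 0; i < n_seen; i++)
        if (memcmp(seen[i].mac, mac, 6) == 0)
            hits++;
    return hits;
}

/*
 * Return the configured name for a MAC, or NULL when the MAC is unknown
 * or is reported by more than one interface (the shared-MAC case).
 */
const char *name_for_mac(const struct mac_name *table, size_t n_table,
                         const struct iface *seen, size_t n_seen,
                         const unsigned char mac[6])
{
    assert(table != NULL && seen != NULL && mac != NULL);
    if (mac_count(seen, n_seen, mac) != 1)
        return NULL;            /* absent or ambiguous */
    for (size_t i = 0; i < n_table; i++)
        if (memcmp(table[i].mac, mac, 6) == 0)
            return table[i].name;
    return NULL;
}
```

With a single system-wide MAC the lookup always returns NULL, so some other key (bus location, for instance) would be needed to break the tie.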
-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-15 Thread Jonathan Lundell

At 9:34 PM -0400 2001-05-15, Nicolas Pitre wrote:
>On Wed, 16 May 2001, Daniel Phillips wrote:
>
>>  On Tuesday 15 May 2001 23:20, Nicolas Pitre wrote:
>>  > Personally, I'd really like to see /dev/ttyS0 be the first detected
>>  > serial port on a system, /dev/ttyS1 the second, etc.
>>
>>  There are well-defined rules for the first four on PC's.  The ttySx
>>  better match the labels the OEM put on the box.
>
>Then just make them be detected first.

Well, they traditionally start with 1, not 0, too. Or have cute 
little icons and no text. Or aren't labelled at all. I'm using one 
fairly well-known dual-port PCI serial board that silently 
interchanged the two ports on a rev change, with no labelling change 
at all ('cause there was no label!). Make your ttySx match *that*!

-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-15 Thread Jonathan Lundell

At 1:18 PM -0700 2001-05-15, Linus Torvalds wrote:
>>  1 (network domain). I have two network interfaces that I connect to
>>  two different network segments, eth0 & eth1;
>
>So?
>
>Informational. You can always ask what "eth0" and "eth1" are.
>
>There's another side to this: repeatability. A setup should be
>_repeatable_.
>
>This is what we have now. Network devices are called "eth0..N", and nobody
>is complaining about the fact that the numbering is basically random. It
>is _repeatable_ as long as you don't change your hardware setup, and the
>numbering has effectively _nothing_ to do with "location".
>
>You don't say "oh, I have my network card in PCI bus #2, slot #3,
>subfunction #1, so I should do 'ifconfig netp2s3f1'". Right?
>
>The location of the device is _meaningless_.

I *like* eth0..n (I'd like net0..n better). And I *can't* ask what 
eth0 and eth1 are, by the way, but I should be able to (Jeff Garzik 
has proposed an extension to ethtool to help out this lack, but it's 
not in Linux today, and needs concrete implementation anyway).

But that's not my point. I'm *not* proposing that we exchange eth0 
for geographic names. I'm suggesting, though, that the location of 
the device is *not* meaningless, because it's the physically-located 
RJ45 socket (or whatever) that I have to connect a particular cable 
to. Sure, no big deal for systems with a single connection, but it 
becomes a real pain when you've got a dozen, which is a reasonable 
number for some network-infrastructure functions (eg firewalls).

When I ifconfig one of a collection of interfaces, I'm very much 
talking about the specific physical interface connected via a 
specific physical cable to a specific physical switch port.

Bob Glamm is on the right track with

At 5:35 PM -0500 2001-05-15, Bob Glamm wrote:
>   # start up networking
>   for i in eth0 eth1 eth2; do
>       identify device $i
>       get configuration/config procedure for device $i identity
>       configure $i
>   done

...it's just that right now the connection between eth* and its 
physical identity isn't made.
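
For the "identify device $i" step, what userland can already ask the kernel for is the hardware address, not the physical socket. A sketch using the Linux SIOCGIFHWADDR ioctl; read_hwaddr() is a hypothetical helper name, and treating the MAC as the device's identity is an assumption that fails when MAC addresses are shared or reassigned.

```c
#define _DEFAULT_SOURCE         /* for struct ifreq under strict -std= modes */
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

/*
 * Read the hardware (MAC) address of a named interface on Linux.
 * Returns 0 on success, -1 if the name is too long or if the socket
 * or the ioctl fails (for example, no such interface).
 */
int read_hwaddr(const char *ifname, unsigned char mac[6])
{
    struct ifreq ifr;
    int fd, rc;

    assert(ifname != NULL && mac != NULL);
    if (strlen(ifname) >= IFNAMSIZ)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);   /* any socket will do */
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof ifr);
    strcpy(ifr.ifr_name, ifname);          /* length checked above */
    rc = ioctl(fd, SIOCGIFHWADDR, &ifr);
    close(fd);
    if (rc < 0)
        return -1;

    memcpy(mac, ifr.ifr_hwaddr.sa_data, 6);
    return 0;
}
```

The result could then be used as the key for the per-device configuration lookup; nothing here says where on the box the interface actually is.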
-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-15 Thread Jonathan Lundell

At 4:35 PM -0700 2001-05-15, David Brownell wrote:
>[ Re why "physical" device IDs _should_ have a critical role in sysadmin ]
>
>>  I would have to agree that "stable" is critical to not driving people
>>  crazy.  In the case of AIX, once a device is enumerated, it will retain
>>  the same name across reboots.  Enough information is kept about each
>>  device to determine if it has already been enumerated (i.e. same I/O
>>  port address for serial devices, MAC address for ethernet cards, etc),
>>  or if it is a new device and should get a new name.
>
>I caught those refs to how AIX does this ... sounds worth learning from.
>Does it handle USB "port addresses" (which bus and hub)?

Solaris has a scheme that addresses the issue as well. Device nodes
live in /devices (/dev has soft links into /devices) and have
system-global geographic names. In Solaris talk, the 0-1-2 of
eth0-1-2 is an instance. There's a file, /etc/path_to_inst, that
records the connection of a device instance to its /devices
geographical name.

It does keep naming stable, but can be a PITA at times when you're 
reconfiguring a system and *want* to renumber things. (There are 
magic ways to do it, though).

That's all Solaris 2.6; not sure about 2.8.
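
A sketch of reading one such mapping line. The assumed line format is a double-quoted physical path followed by an instance number, with anything after that ignored; check the file on a real system before relying on it. The function name and the buffer size are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PATH_MAX_LEN 256        /* arbitrary limit for this sketch */

/*
 * Parse one line of the assumed form:  "<physical path>" <instance>
 * Returns 0 on success, -1 for comment, blank or malformed lines.
 */
int parse_path_to_inst(const char *line, char path[PATH_MAX_LEN],
                       int *instance)
{
    assert(line != NULL && path != NULL && instance != NULL);
    /* %255[^"] reads up to the closing quote, leaving room for the NUL. */
    if (sscanf(line, " \"%255[^\"]\" %d", path, instance) != 2)
        return -1;
    return 0;
}
```

Going from an instance number back to a geographical name is then a scan of the file for the matching entry, which is the direction an operator holding a cable cares about.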
-- 
/Jonathan Lundell.



Re: LANANA: To Pending Device Number Registrants

2001-05-15 Thread Jonathan Lundell

At 11:15 AM -0700 2001-05-15, Linus Torvalds wrote:
>The part I absolutely detest is when the information becomes more than
>just "information", and is used to enforce a world-view. Anybody who uses
>physical location for naming devices (ie you have to know where the hell
>the thing is in order to look it up), is so far out to lunch that it's not
>even funny. And the sad fact is that this is pretty much how ALL unixes
>have historically done things ("Oh, you want to see the disk? Sure. It's
>on scsi bus 1, channel 2, ID 3, lun 0, so you just open /dev/s1c3l0 and
>you're done! Easy as pie!").
>
>Keep it informational. And NEVER EVER make it part of the design.

What about:

1 (network domain). I have two network interfaces that I connect to 
two different network segments, eth0 & eth1; they're ifconfig'd to 
the appropriate IP and MAC addresses. I really do need to know 
physically which (physical) hole to plug my eth0 cable into. 
(Extension: same situation, but it's a firewall and I've got 12 ports 
to connect.) (Extension #2: if I add a NIC to the system and reboot, 
I'd really prefer that the NICs already in use didn't get renumbered.)

2 (disk domain). I have multiple spindles on multiple SCSI adapters. 
I want to allocate them to more than one RAID0/1/5 set, with the 
usual considerations of putting mirrors on different adapters, 
spreading my RAID5 drives optimally, ditto stripes. I need (eg) SCSI 
paths to config all this, and I further need real physical locations 
to identify failed drives that need to be hot-replaced. The mirror 
members will move around as drives are replaced and hot spares come 
into play.

Seems like more than merely informational.

(A side observation: PCI or SCSI bus/device/lun/etc paths are not 
physical locations; you also need external hardware-specific 
knowledge to be able to talk about real physical locations in a way 
that does the system operator any good.)
-- 
/Jonathan Lundell.



Re: [PATCH] Remove silly beep macro from pgtable.h

2001-05-15 Thread Jonathan Lundell

At 7:36 PM +0200 2001-05-15, Mike Galbraith wrote:
>On Tue, 15 May 2001, Jeff Golds wrote:
>
>>  Hi folks,
>>
>>  Found this bit of unused code in the i386 and sh architectures. 
>>As it's not being used, let's get rid of it.  Also, pgtable.h seems 
>>to be an odd place for this.
>
>I'd leave it.. folks with early boot troubles might find it useful.
>
>   -Mike

Consider small rant about literal IO references to magic locations 
hereby ranted. Especially in header files completely unrelated to the 
IO function in question.

-#define __beep() asm("movb $0x3,%al; outb %al,$0x61")

Let's please not assume that every i386 implementation has a full set 
of legacy PC IO hardware.
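
For reference, a sketch of what the magic numbers mean on a legacy PC: port 0x61 is the system control port, bit 0 gates timer channel 2 and bit 1 enables the speaker output, so writing 0x3 switches the speaker on. The constant and function names are hypothetical, and no port I/O is performed here.

```c
#include <assert.h>
#include <stdint.h>

#define PC_SYS_CTRL_PORT_B 0x61   /* legacy PC/AT system control port */
#define SPKR_TIMER2_GATE   0x01   /* bit 0: gate input of PIT channel 2 */
#define SPKR_DATA_ENABLE   0x02   /* bit 1: route channel 2 to the speaker */

/* Value to write to the port to switch the speaker on, keeping other bits. */
uint8_t speaker_on(uint8_t current)
{
    return (uint8_t)(current | SPKR_TIMER2_GATE | SPKR_DATA_ENABLE);
}

/* Value to write to the port to switch the speaker off, keeping other bits. */
uint8_t speaker_off(uint8_t current)
{
    return (uint8_t)(current & ~(SPKR_TIMER2_GATE | SPKR_DATA_ENABLE));
}
```

Unlike the macro, which writes 0x3 outright, these helpers preserve the other bits of the register; either way the code is only meaningful where that legacy hardware exists.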
-- 
/Jonathan Lundell.



Re: Not a typewriter

2001-05-13 Thread Jonathan Lundell

>why creat doesn't end in an "e;" and so forth.  I tell the

Some time back, Ken Thompson was asked, if he had it to do over 
again, what changes he would make to Unix. The only thing he could 
think of: spell it "create()".
-- 
/Jonathan Lundell.



Re: ENOIOCTLCMD?

2001-05-13 Thread Jonathan Lundell

At 5:45 PM +0100 2001-05-13, Alan Cox wrote:
>>  What I was arguing (conceptually) is that something like
>>  #define ENOIOCTLCMD ENOTTY
>>  or preferably but more invasively s/ENOIOCTLCMD/ENOTTY/ (mutatis mutandis)
>>
>>  would result in no loss of function. I assert that ENOIOCTLCMD is
>>  redundant, pending a specific counterexample.
>
>On the contrary. I can now no longer force an unsupported response when there
>is a generic routine I dont wish to use

That makes sense. Thanks.
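
A userland model of the counterexample, as a sketch only. The command numbers, the generic handler and the demo driver are invented for illustration; ENOIOCTLCMD is defined locally because it is a kernel-internal code that userland <errno.h> does not provide (515 is the value in the kernel's errno.h).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ENOIOCTLCMD
#define ENOIOCTLCMD 515         /* kernel-internal: "no ioctl command" */
#endif

typedef int (*ioctl_fn)(unsigned int cmd);

/* Generic layer: handles commands 1 and 2 for every device of a class. */
int generic_ioctl(unsigned int cmd)
{
    return (cmd == 1 || cmd == 2) ? 0 : -ENOTTY;
}

/* Try the driver first; fall back to the generic code only on ENOIOCTLCMD. */
int class_ioctl(ioctl_fn driver, unsigned int cmd)
{
    if (driver != NULL) {
        int err = driver(cmd);

        if (err != -ENOIOCTLCMD)
            return err;         /* handled, or deliberately refused */
    }
    /* Driver-specific code does not support this ioctl. */
    return generic_ioctl(cmd);
}

/* Hypothetical driver: implements 3, refuses 2, has no opinion otherwise. */
int demo_driver_ioctl(unsigned int cmd)
{
    switch (cmd) {
    case 3:
        return 0;
    case 2:
        return -ENOTTY;         /* override: generic 2 is not appropriate */
    default:
        return -ENOIOCTLCMD;    /* let the generic code decide */
    }
}
```

If ENOIOCTLCMD were merely an alias for ENOTTY, the driver's refusal of command 2 would be indistinguishable from "no opinion", and the generic routine would run anyway.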
-- 
/Jonathan Lundell.



Re: ENOIOCTLCMD?

2001-05-13 Thread Jonathan Lundell

At 5:43 PM +0100 2001-05-12, Alan Cox wrote:
>>  That's what's confusing me: why the distinction? It's true that the
>>  current scheme allows the dev->ioctlfunc() call below to force ENOTTY
>>  to be returned, bypassing the switch, but presumably that's not what
>>  one wants.
>
>It allows driver specific code to override generic code, including 
>by reporting
>that a given feature is not available/appropriate.
>
>Alan

What I was arguing (conceptually) is that something like

#define ENOIOCTLCMD ENOTTY

or preferably but more invasively s/ENOIOCTLCMD/ENOTTY/ (mutatis mutandis)

would result in no loss of function. I assert that ENOIOCTLCMD is 
redundant, pending a specific counterexample.
-- 
/Jonathan Lundell.



Re: ENOIOCTLCMD?

2001-05-13 Thread Jonathan Lundell

At 3:27 PM -0700 2001-05-12, Shane Wegner wrote:
>>  int err = dev->ioctlfunc(dev, op, arg);
>>  if( err != -ENOIOCTLCMD)
>>          return err;
>>
>>  /* Driver specific code does not support this ioctl */
>
>I noticed this return coming out of the watchdog driver a
>while ago when I was playing with it.  I have taken a quick
>look and it seems a few drivers do return this directly to
>userspace.  I'm not sure if this is complete but ...

Can't this be handled in sys_ioctl()? At the very end, replace

out:
        return error;

with

out:
        return (error == -ENOIOCTLCMD) ? -ENOTTY : error;
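
As a stand-alone illustration of that translation, here is a minimal sketch. The helper name ioctl_exit_code() is hypothetical, not an existing kernel function, and ENOIOCTLCMD is defined locally with the value from the errno.h excerpt in this thread because user-space headers do not provide it.

```c
#include <errno.h>

/* Kernel-internal code; value taken from the errno.h excerpt in this thread. */
#ifndef ENOIOCTLCMD
#define ENOIOCTLCMD 515
#endif

/*
 * Hypothetical helper: map the internal "no ioctl command" code to the
 * user-visible ENOTTY at the syscall exit; pass everything else through.
 */
static int ioctl_exit_code(int error)
{
        return (error == -ENOIOCTLCMD) ? -ENOTTY : error;
}
```

With a single translation point like this, a driver that lets -ENOIOCTLCMD escape would still show ENOTTY to user space, without patching each driver individually.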


>diff -ur linux-2.4.4-ac8/drivers/block/swim3.c linux/drivers/block/swim3.c
>--- linux-2.4.4-ac8/drivers/block/swim3.c  Sat May 12 14:59:44 2001
>+++ linux/drivers/block/swim3.c  Sat May 12 15:22:30 2001
>@@ -848,7 +848,7 @@
>  sizeof(struct floppy_struct));
>   return err;
>   }
>-  return -ENOIOCTLCMD;
>+  return -ENOTTY;
>  }


-- 
/Jonathan Lundell.



Re: ENOIOCTLCMD?

2001-05-12 Thread Jonathan Lundell

At 12:16 PM +0100 2001-05-12, Alan Cox wrote:
>> Can somebody explain the use of ENOIOCTLCMD? There are order of 170
>> uses in the kernel, but I don't see any guidelines for that use (nor
>> what prevents it from being seen by user programs).
>
>It should never be seen by apps. If it can be then it is wrong code.
>Basically you use it in things like

I was surmising something like that, but in that case aren't 
ENOIOCTLCMD and ENOTTY redundant? That is, could not every occurrence 
of ENOIOCTLCMD be replaced by ENOTTY with no change in function? 
That's what's confusing me: why the distinction? It's true that the 
current scheme allows the dev->ioctlfunc() call below to force ENOTTY 
to be returned, bypassing the switch, but presumably that's not what 
one wants.

>   int err = dev->ioctlfunc(dev, op, arg);
>   if( err != -ENOIOCTLCMD)
>           return err;
>
>   /* Driver specific code does not support this ioctl */
>
>   switch(op)
>   {
>
>           ...
>           default:
>                   return -ENOTTY;
>   }
>
>It's a way of passing back 'you handle it'


-- 
/Jonathan Lundell.



ENOIOCTLCMD?

2001-05-11 Thread Jonathan Lundell

Can somebody explain the use of ENOIOCTLCMD? There are on the order of
170 uses in the kernel, but I don't see any guidelines for that use (nor
what prevents it from being seen by user programs).

Thanks.

errno.h:

>/* Should never be seen by user programs */
>#define ERESTARTSYS     512
>#define ERESTARTNOINTR  513
>#define ERESTARTNOHAND  514 /* restart if no handler.. */
>#define ENOIOCTLCMD     515 /* No ioctl command */

-- 
/Jonathan Lundell.


