Re: [PATCH 2.6.13-rc6] dcdbas: add Dell Systems Management Base Driver with sysfs support

2005-08-25 Thread Michael E Brown
Please download libsmbios 0.10.0-beta1 and send the "dumpCmos" output
from your machine. Please send it to the libsmbios devel mailing list.
From that output, I can tell you if this token is available on that
machine. If that token is available, then yes, you can set that feature.

libsmbios can be obtained from http://linux.dell.com/libsmbios/main/

dumpCmos is part of "make minimal", so you don't need any other libs
present to compile. It is found under bins/output/ after compile.
(alternatively, install the libs and bins rpms)
--
Michael

On Thu, 2005-08-25 at 14:44 +0100, David Greaves wrote:
> 
> I have a Dell SC420
> Is there a way (based around this patch) to allow users to enable and
> set the auto-power-on BIOS feature?
> (ie tell the BIOS to power on at 3:40am, power the system down, watch it
> power up at 3:40am)
> 
> Normally I'd use 'nvram-wakeup' but it doesn't understand the Dell BIOS.
> 
> If so what I'd _like_ to do is send a patch to nvram-wakeup that tests
> for this capability and uses it if it's there.
> 
> David
> 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-16 Thread Michael E Brown
On Wed, 2005-08-17 at 02:23 +0200, Andi Kleen wrote:
> <[EMAIL PROTECTED]> writes:
> > 2) Dell OpenManage
> > The main use of this driver by openmanage will be to read the System 
> > Event Log that BIOS keeps. Here are some other random relevant points:
> 
> Are there machine check events from the last boot in that event log? 

I don't know. Either Doug or Abhay may, though. If they don't I can ask
the BIOS guys.

--
Michael



Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-16 Thread Michael E Brown
On Tue, 2005-08-16 at 01:16 -0700, Greg KH wrote:
> On Mon, Aug 15, 2005 at 10:10:28PM -0500, Michael E Brown wrote:
> > To take a concrete example, I suggested to Doug to mention fan status. I
> > get the feeling that you possibly think that this would be better
> > integrated into lmsensors, or something like that.
> 
> Yes it should.  That way you get the benefit of all userspace
> applications that already use the lmsensors library without having to be
> rewritten in order to support your new library.

Little did I know when I first mentioned it how big of a mistake it
would be to mention the sensor functions. *sigh*

The dcdbas driver allows access to all of the Dell SMIs. Sensors are
only one instance of SMI code (only two functions, in fact, if I am
reading this spec correctly). The other (roughly) 58 functions have
nothing to do with sensors. The presence of the dcdbas driver would not
stop anybody from writing another driver to provide a hwmon interface to
just the sensors pieces. 

This isn't like a PCI device where there can be only one driver. With
the dcdbas driver in place, other drivers could also be written to
provide subsystem-specific interfaces to the same data, such as hwmon.
There isn't anything in dcdbas that would conflict with or lock out
anybody from creating another driver that provides access to the same
data.

> 
> > That really isn't the case, as lmsensors is really geared towards
> > bit-banging lm81 (for example) chips to get fan status.
> 
> Not true at all.  It is geared toward providing a common userspace
> interface for all sensor information in the system.  Now if it provides
> this in a good and easy to use way is another story...
> 
> But anyway, there is a standard way to export fan speed and temperature
> information from the kernel, the hwmon interface (see -mm for examples
> and documentation, and the i2c stuff in mainline today.)

I don't really know a bunch about lmsensors. I just downloaded it and
started poking around. I would have thought, though, that they would
provide an easy way to provide a userspace library method of extending
for new sensors. I suppose I was wrong here as I don't see such
functionality on first glance.

> 
> > In our case, we have a standardized BIOS interface to get this info,
> > and that standardized method involves SMI and not bit-banging
> > interfaces. Once this driver is accepted into the kernel, we can go
> > and add support in the _userspace_ lmsensors libs to poll fan and temp
> > using this driver.
> 
> No, export this data properly through sysfs like all other temperature
> and sensor data is.  Don't create a new one, no matter how much you
> would like to keep from changing kernel code in the future for new
> hardware.

This driver is not trying to create a new way to do sensor and monitor
data. This just happens to be a side effect of the main use case.

> 
> > For example, we already have at least one buggy implementation of this
> > exact stack in the kernel as the i8k driver. The i8k driver was reverse-
> > engineered and works, but it does not follow the spec at all, and so is
> subject to major breakage if the BIOS changes. With dcdbas + libsmbios,
> > we can write this _correctly_, and in such a way that it follows the
> > spec and will not break on BIOS updates.
> 
> No, fix the i8k driver as you have access to the specs.  It was there
> first.

Ok.

> 
> > What you are asking us to do is just not feasible on many levels. First,
> > just from the number of functions we would have to implement. We would
> > have to implement about 60 new sysfs files, with at least 120 separate
> > functions for read/write.
> 
> No problem at all, we can create that with only 2 read/write functions.
> See the i2c code for details.

file/line#? I did a quick search and didn't see anything special.

My main point here is that each SMI call would require its own
kernel-space parsing of parameter and return values, as each call has
different argument passing requirements.
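
The "2 read/write functions" suggestion quoted above is presumably a
table-driven layout, in which each attribute carries an index and a single
pair of handlers dispatches on it. The following is a minimal userspace
sketch in C, not kernel code. The names `struct smi_attr`, `attr_show`,
`attr_store`, `attr_find`, and the attribute names are hypothetical and are
not taken from the i2c code or from dcdbas.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical attribute table entry: a name plus an index into shared state. */
struct smi_attr {
    const char *name;
    int index;
};

/* Simulated backing store standing in for values a real driver would fetch. */
static int values[3] = { 4200, 37, 1 };

static const struct smi_attr attrs[] = {
    { "fan1_input",  0 },
    { "temp1_input", 1 },
    { "chassis_ok",  2 },
};

/* One "show" handler for every attribute: format the indexed value. */
static int attr_show(const struct smi_attr *a, char *buf, size_t len)
{
    return snprintf(buf, len, "%d\n", values[a->index]);
}

/* One "store" handler for every attribute: parse and save the indexed value. */
static int attr_store(const struct smi_attr *a, const char *buf)
{
    int v;

    if (sscanf(buf, "%d", &v) != 1)
        return -1;
    values[a->index] = v;
    return 0;
}

/* Look an attribute up by name; returns NULL if it is not in the table. */
static const struct smi_attr *attr_find(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(attrs) / sizeof(attrs[0]); i++)
        if (strcmp(attrs[i].name, name) == 0)
            return &attrs[i];
    return NULL;
}
```

This works when every attribute shares one value format. It does not by
itself answer the objection above: calls whose arguments and return values
differ would still each need their own parsing.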

> 
> > Each function would have to take into account the specific calling
> > requirements of that specific function.
> 
> Again, no different from any other sensor driver.

Again, this driver is not a sensor driver. 

> 
> > Then, we would have to implement all of the bugfixes and
> > platform-specific workarounds in the kernel for each of those
> > functions for each Dell platform.
> 
> Yup.
> 
> > Each time another function is added to BIOS, we would have to go out
> > and patch everybody's kernel to support the new function.
> 
> Yup.  I suggest you complain to the bios people about this horrible way
> to design hardware then :)

You have a

Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown
On Tue, 2005-08-16 at 01:36 -0400, [EMAIL PROTECTED] wrote:
> On Mon, 15 Aug 2005 23:58:43 CDT, Michael E Brown said:
> 
> > No, this is an _EXCELLENT_ reason why _LESS_ of this should be in the
> > kernel. Why should we have to duplicate a _TON_ of code inside the
> > kernel to figure out which platform we are on, and then look up in a
> > table which method to use for that platform? We have a _MUCH_ nicer
> > programming environment available to us in userspace where we can use
> > things like libsmbios to look up the platform type, then look in an
> > easily-updateable text file which smi type to use. In general, plugging
> > the wrong value here is a no-op.
> 
> You'll still need to do some *very* basic checking - there's a fairly
> scary-looking 'outb' call in callintf_smi() and host_control_smi() that
> seems to be totally too trusting that The Right Thing is located at address
> CMOS_BASE_PORT:

Ok, very nice. Finally some actual code review, thanks. :-)

> 
> +     for (index = PE1300_CMOS_CMD_STRUCT_PTR;
> +          index < (PE1300_CMOS_CMD_STRUCT_PTR + 4);
> +          index++) {
> +             outb(index,
> +                  (CMOS_BASE_PORT + CMOS_PAGE2_INDEX_PORT_PIIX4));
> +             outb(*data++,
> +                  (CMOS_BASE_PORT + CMOS_PAGE2_DATA_PORT_PIIX4));
> +     }
> 
> This Dell C840 has an 845, not a PIIX.  What just got toasted if this driver
> gets called?

These are all just standard CMOS port numbers that pretty much every
chipset uses to access CMOS.

If you have not got a PE1300, the worst that happens is you have some
random bits scribbled into your CMOS. Nothing that the "cmos clear"
jumper won't fix. :-)

Seriously, this file is meant to be activated by a userspace program
that looks up the system ID in the appropriate table and writes the
correct value into the driver. I'm not sure there is much more to be
said than "don't do that." The official Dell management stack will
always ensure that this is set correctly. If you don't use the official
Dell stack, libsmbios is available, and we would be happy to make the
appropriate tables available there. If you don't use either of these and
go fiddling with this, I'm not sure there is much more to be done.
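
The userspace step described above (look up the system ID in a table, then
write the matching value into the driver) could look roughly like the
following C sketch. The table contents, the ID values, and the function name
`smi_type_for_system` are invented for illustration; the real system IDs and
SMI type values are not given in this thread. Using 0 to mean "unknown
system" is also an assumption.

```c
#include <stddef.h>

/* Hypothetical table row: which host control SMI type a given system uses. */
struct smi_type_entry {
    unsigned int system_id;   /* made-up IDs, not real Dell system IDs */
    int smi_type;             /* made-up values for host_control_smi_type */
};

static const struct smi_type_entry smi_type_table[] = {
    { 0x1001, 1 },
    { 0x1002, 2 },
    { 0x1003, 1 },
};

/*
 * Return the SMI type for system_id, or 0 if the system is unknown.
 * A caller would write a non-zero result to the driver and do nothing
 * for 0, so an unrecognized machine is never poked.
 */
static int smi_type_for_system(unsigned int system_id)
{
    size_t i;

    for (i = 0; i < sizeof(smi_type_table) / sizeof(smi_type_table[0]); i++)
        if (smi_type_table[i].system_id == system_id)
            return smi_type_table[i].smi_type;
    return 0;
}
```

Keeping the table in userspace means a new platform only needs a new table
row, which is the maintenance argument made in this thread.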

> 
> Can we have a check that the machine is (a) a Dell and (b) has a PIIX and
> (c) the PIIX has a functional SMI behind it, before we start doing outb()
> calls?
> 
> 

I'll have to defer to Doug on this. It may be possible to arrange this.
--
Michael




Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown
On Mon, 2005-08-15 at 22:35 -0700, Chris Wedgwood wrote:
> On Tue, Aug 16, 2005 at 12:19:49AM -0500, Michael E Brown wrote:
> 
> > Hmm... did I mention libsmbios? :-)
> > http://linux.dell.com/libsmbios/main.
> 
> I'm aware of it --- it seems pretty limited right now and I'm still
> irked Dell isn't more forthcoming with documentation.

I cannot give docs, but I can retype the docs into code or xml files.
What, specifically, were you looking for?

I intend to make an XML file of all of the token name, id, and
description mappings within the next couple of weeks. This should pretty
much document all of the token mappings. 

Next thing would be to do something for the SMI calls. What I will do
there is basically just put a big table and make a C-API available for
every SMI call we support.

> Given that why not resubmit the kernel driver when the userspace
> becomes usable for people without them having to use MonsterApp from
> Dell?

Well, there are three different groups involved here. I didn't write the
dcdbas code, Doug did. I just reviewed it and decided it would be nice
to implement in libsmbios. I started work on the libsmbios side of
things this weekend. I didn't know that Doug had reposted the driver to
linux-kernel until about 4pm this afternoon. :-(

Libsmbios isn't the only user of dcdbas. That is the third group.
(MonsterApp, so nicely put...)

When I found out Doug reposted the driver, I went into overdrive trying
to finish out libsmbios. But, basically, libsmbios is a one-person
project, and that person would be me. And I have a "real" job to do
besides just libsmbios. :-)  The best I can guarantee is next week,
although if my manager is understanding, I may have it done sooner. :-)

> 
> > Aside from that, for the most part, the only thing SMI ever does is
> > pass buffers back and forth.
> 
> I meant to ask; does this have horrible latency or nasties like lots
> of laptop SMM stuff?

That really depends, I guess. The hugely horrible laptop SMM stuff
mostly has to do with the battery gauge. The reason that the battery
stuff takes so long is that they basically do an entire current
measurement and computation of the battery each time the SMI is called
and do not (and pretty much cannot) cache anything from call to call.
Compounding things, they have to talk to the battery over a very slow
serial link. (as related to me by a former BIOS engineer)

I haven't done any measurements on servers, but I bet that most of it
isn't anywhere near as bad as the laptop stuff.
--
Michael



Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown
On Tue, 2005-08-16 at 01:17 -0400, [EMAIL PROTECTED] wrote:
> On Mon, 15 Aug 2005 23:09:28 CDT, you said:
> 
> > No, dcdbas has nothing to do with this. I'll have to submit a patch
> > against the docs. The program you need to use already exists and is
> > open source. You can use libsmbios to do this.
> > http://linux.dell.com/libsmbios/main.
> 
> Now I'm confoozled.  Maybe - I suspect we're actually in violent agreement...

nope... :-)

> 
> On Mon, 15 Aug 2005 17:58:56 CDT, [EMAIL PROTECTED] said:
> > Additionally, we are releasing an open source library (GPL/OSL dual
> > license) that can use these hooks to perform many systems management
> > functions in userspace. See http://linux.dell.com/libsmbios/main/. We
> > should have code in libsmbios to do SMI using this driver within about two
> > weeks.  We are currently writing the SMI hooks in libsmbios using this
> > posted version of the driver. I am the maintainer of this project, and it
> > is my goal to have code in libsmbios for every Dell SMI call.
> 
> So dcdbas *is* intended as the kernel end of the userspace libsmbios, which
> is the suggested way of getting that BIOS updated. OK, I got it now.. ;)

not quite... :-)

Basically, for the exact case of RBU, libsmbios _today_ has what is
necessary to support this, without using dcdbas.

Today, libsmbios can set certain CMOS bits. _Some_ of the BIOS F2 screen
options are represented in CMOS as bits. Also, other features are made
available through CMOS that are not available through F2, and all of
these bits (F2 bits and other bits) can be manipulated by libsmbios. It
just so happens that RBU is implemented using a CMOS bit (represented by
tokens 0x005C and 0x005D).
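
To make "a CMOS bit represented by a token" concrete, here is a small C
sketch that applies a token to a simulated 256-byte CMOS image. The
`struct cmos_token` layout (a byte offset plus an AND mask and an OR value),
the offset 0x40, and the function names are assumptions for illustration,
not taken from libsmbios or from a Dell spec. Real code would go through
libsmbios rather than touch CMOS directly.

```c
#include <stddef.h>
#include <stdint.h>

#define CMOS_SIZE 256   /* size of the simulated CMOS image in bytes */

/* Hypothetical token description: where the bit lives and how to set it. */
struct cmos_token {
    uint16_t id;        /* token number, e.g. 0x005C */
    uint8_t  offset;    /* byte offset into the CMOS image (made up) */
    uint8_t  and_mask;  /* bits to keep */
    uint8_t  or_value;  /* bits to set after masking */
};

/* Two tokens sharing one bit: one sets it, the other clears it. */
static const struct cmos_token tokens[] = {
    { 0x005C, 0x40, 0xFE, 0x01 },
    { 0x005D, 0x40, 0xFE, 0x00 },
};

/* Apply the token with the given id; returns 0 on success, -1 if unknown. */
static int activate_token(uint8_t cmos[CMOS_SIZE], uint16_t id)
{
    size_t i;

    for (i = 0; i < sizeof(tokens) / sizeof(tokens[0]); i++) {
        if (tokens[i].id == id) {
            uint8_t *byte = &cmos[tokens[i].offset];

            *byte = (uint8_t)((*byte & tokens[i].and_mask) | tokens[i].or_value);
            return 0;
        }
    }
    return -1;
}

/* A token counts as active when the bits it owns match its OR value. */
static int token_is_active(const uint8_t cmos[CMOS_SIZE],
                           const struct cmos_token *t)
{
    return (cmos[t->offset] & (uint8_t)~t->and_mask) == t->or_value;
}
```

In this model, activating one token of the pair implicitly deactivates the
other, which matches the activate/cancel usage of 0x005C and 0x005D
described in this thread.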

The addition of 'dcdbas' driver enables _extra_, _additional_
functionality that libsmbios does not today have. The rest of the BIOS
F2 screen options that are not in CMOS are available through SMI. Also,
lots of other interesting stuff that is not related to BIOS F2 screens
is available through SMI.

To give an example: the Asset tag can be set through CMOS and SMI.
Today, libsmbios can only set asset tag through CMOS. With the addition
of dcdbas, libsmbios can use the SMI method to update asset tag. 

SMI is a more reliable way to set asset tag, as it is dynamic and system
flash is updated right away. Future systems may drop CMOS method
completely as we start to run out of room in CMOS (there are only 256
_bytes_ available in CMOS, remember.) 

Basically, I am positioning libsmbios as an open-source way to take
advantage of all of the features of a Dell system that are available
through the system smbios/dmi table (similar to dmidecode), system cmos,
or through SMI calls.

> 
> (continuing on)
> 
> > The binary you want to use is "activateCmosToken", under bins/output/
> > (after compilation). The command line syntax is like this:
> > activateCmosToken 0x005C
> > 
> > If you want to cancel a BIOS update that has already been activated
> > (per above), use:   
> > activateCmosToken 0x005D
> > 
> > Basically, follow the docs in the RBU docs as far as cat-ing the bios
> > update image to the rbu sysfs files, then use the activateCmosToken
> > program to tell BIOS to do the update on reboot. 
> 
> Ahh... the missing piece I didn't have before. :)

I provided this info to Abhay when he posted RBU, and I thought he had
already updated the rbu docs with this info. I suppose I should have
checked. :-(

--
Michael



Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown
Again, please cc Doug and me on replies...

Kyle Moffett wrote:
>On Aug 16, 2005, at 00:34:51, Chris Wedgwood wrote:
>> On Mon, Aug 15, 2005 at 04:23:37PM -0400, Kyle Moffett wrote:
>>> Why can't you just implement the system management actions in the
>>> kernel driver?
>>
>> Why put things in the kernel unless it's really needed?

Thank you. Our sentiments exactly.

>>
>> I'm not thrilled about the lack of userspace support for this driver
>> but that still doesn't mean we need to shovel wads of crap into the
>> kernel.

Hmm... did I mention libsmbios? :-)
http://linux.dell.com/libsmbios/main.

SMI support is not yet implemented inside libsmbios, but I am working
feverishly on it (in-between emails to linux-kernel, of course.) :-) We
have a development mailing list, and I will make the announcement there
when it is complete. I will also be submitting patches to the RBU
documentation and dcdbas documentation pointing to libsmbios for folks
that want a 100% open-source method of using these drivers.

I cannot (at this point, I'm working on it, though), provide our
internal documentation of our SMI implementation directly. But, I am
authorized to add all of this to libsmbios, and I intend to very
clearly document all of the SMI calls in libsmbios. 

>
> I'm worried that it might be more of a mess in userspace than it could be
> if done properly in the kernel.  Hardware drivers, especially for something
> as critical as the BIOS, should probably be done in-kernel.  Look at the
> mess that X has become, it mmaps /dev/mem and pokes at the PCI busses
> directly.  I just don't want an SMI driver to become another /dev/mem.

Again, this is a whole different thing from a video card driver. The
SMI driver consists of one instruction: "outb magic_port#,
magic_value;", with the simple addition that EBX must contain the
physical address of the buffer that holds the requested command code and
the return values.

There isn't really a whole lot more than that. For the Dell SMI, you
have to look up the magic port # and magic value in smbios,
specifically, there is a vendor structure 0xDA with a specific layout
(which will be documented in libsmbios) that specifies the magic port
and value.

Aside from that, for the most part, the only thing SMI ever does is
pass buffers back and forth. 

-- 
Michael





Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown
I am not subscribed to linux-kernel. Please cc me (and Doug) on all
replies. Sorry if I'm breaking people's threading, but I am
cut-and-pasting this from web archives, since this wasn't cc-d to me
originally.

>On Aug 15, 2005, at 19:38:49, Doug Warzecha wrote:
>> On Mon, Aug 15, 2005 at 04:23:37PM -0400, Kyle Moffett wrote:
>>> Why can't you just implement the system management actions in the
>>> kernel driver?
>>
>> We want to minimize the amount of code in the kernel and avoid having to
>> update the driver each time a new system management command is added.
>
> One of the recent trends in kernel driver development is to make as much
> as possible accessible through standard tools (like with echo and cat via
> sysfs).

Where it makes sense. Everything can be taken too far, and I believe
that you are taking this past the point of sanity. Are you also going to
advocate that X stop mmap()-ing /dev/mem to manipulate PCI memory-space
of the video cards, but rather we should have a kernel driver that
makes every register of each PCI card available as a file in sysfs so
that we can re-write X as a big bash script? Let me know how that works
out.

>
>> The libsmbios project is being updated to use this code.
>> http://linux.dell.com/libsmbios/main/.  Using the libsmbios code, you
>> will be able to set all of the options in BIOS F2 screen from Linux
>> userspace.  Also, libsmbios is looking at implementing a few other things
>> like fan status.  Libsmbios is 100% open-source (OSL/GPL dual license).
>
> From my point of view, this driver could use sysfs almost entirely and put
> all of the hardware-manipulation code completely in kernel space, along
> with the hardware detection code.  You could have plain-text files in
> /sys/bus/platform/dellbios/ that have all of the BIOS F2 options accessible
> to the admin from the command line, without special tools.  (You could
> always add an extra program that presents a BIOS-like interface)

Conservatively counting, I see just about 350 different BIOS options in
my current list, plus about 60 different (unrelated) SMI calls. We are
talking about several tens of thousands of lines of code in the kernel
to surface each of these in the kernel along with all of the necessary
BIOS-bug-workaround and platform detection code. This is not pretty,
nor easy code. I, personally, do not want to be responsible for the
parsing bug that causes a root hole here. 

In userspace, I can easily stick all of the cross-references into an
XML file, along with the workarounds and bug-fixes, which makes the
code a bit tighter. We have one project here at Dell that implemented
an all C (userspace) equivalent of what you are talking about, and they
ended up writing a code generation script that took XML definitions of
each option and generated the resulting C code. They still ended up
with a huge bucketload of code. We don't have the same conveniences in
kernel-land. All the nice toys are userspace.

>> The method of generating a host control SMI is not exactly the same for
>> each PowerEdge system listed in dcdbas.txt.  host_control_smi_type tells
>> the driver how to generate the host control SMI for the system in use.
>> I'll update dcdbas.txt with the SMI type value associated with the systems
>> listed in that file.
>
> This is an _excellent_ reason why more of this should be in the kernel.
> What happens if the wrong SMI is used?  Shouldn't it be relatively easy
> for the kernel to determine the correct SMI itself?

No, this is an _EXCELLENT_ reason why _LESS_ of this should be in the
kernel. Why should we have to duplicate a _TON_ of code inside the
kernel to figure out which platform we are on, and then look up in a
table which method to use for that platform? We have a _MUCH_ nicer
programming environment available to us in userspace where we can use
things like libsmbios to look up the platform type, then look in an
easily-updateable text file which smi type to use. In general, plugging
the wrong value here is a no-op.

What you are advocating is that we bloat the kernel beyond belief just
so you can use echo and cat. I thought that we were trying to remove
extra stuff from the kernel. I thought this was the reasoning behind
initramfs and things like irqbalanced.

-- 
Michael




Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown
>On Mon, 15 Aug 2005 18:38:49 CDT, Doug Warzecha said:
>
>> > If this is supposed to be used with the RBU code to trigger a BIOS  
>> > update, ...
>> 
>> This driver is not needed by the RBU code.
>
> Documentation/dell_rbu.txt says:
>
>> The rbu driver needs to have an application which will inform the BIOS to
>> enable the update in the next system reboot.
>
> Can the dcdbas code be used to implement that application?

No, dcdbas has nothing to do with this. I'll have to submit a patch
against the docs. The program you need to use already exists and is
open source. You can use libsmbios to do this.
http://linux.dell.com/libsmbios/main.

The binary you want to use is "activateCmosToken", under bins/output/
(after compilation). The command line syntax is like this:
activateCmosToken 0x005C

If you want to cancel a BIOS update that has already been activated
(per above), use:   
activateCmosToken 0x005D

Basically, follow the docs in the RBU docs as far as cat-ing the bios
update image to the rbu sysfs files, then use the activateCmosToken
program to tell BIOS to do the update on reboot. 

-- 
Michael






Re: [RFC][PATCH 2.6.13-rc6] add Dell Systems Management Base Driver (dcdbas) with sysfs support

2005-08-15 Thread Michael E Brown
On Mon, 2005-08-15 at 21:29 -0400, Kyle Moffett wrote:
> On Aug 15, 2005, at 18:58:56, [EMAIL PROTECTED] wrote:
> >> Why can't you just implement the system management actions in
> >> the kernel driver?  This is tantamount to a binary SMI hook to
> >> userspace.  What functionality does this provide on a dell
> >> system from an administrator's point of view?
> 
> > The second alternative is not entirely feasible. We have over 60
> > SMI functions, and we would have to write a kernel-mode wrapper for
> > each and every one. I hope you agree that code that doesn't exist is
> > less buggy than code that is, and that code that is in userspace is a
> > whole lot less likely to cause a kernel crash than code that is in
> > the kernel.
> 
> I think the second alternative is actually feasible and preferable. The
> point of the kernel is to provide safe and secure access to two things:
> 1) Hardware through an abstraction layer
> 2) Software services (like IP stack) that are not feasible to do in
>    userspace.
> 
> A system that just provides a hunk of DMA RAM and the ability to generate

We are only using the DMA allocation API, we are not actually doing DMA
to those addresses. We use the DMA API to easily restrict allocations to
4GB, as has been previously requested, instead of rolling our own
allocation functions.
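
Why the 4GB limit matters: elsewhere in this thread the buffer's physical
address is described as being passed to the BIOS in EBX, which is a 32-bit
register. The following C sketch expresses that constraint only. The helper
name `fits_below_4gb` is hypothetical; it performs no allocation and is not
the driver's code.

```c
#include <stdint.h>

/*
 * Return 1 if a buffer of len bytes starting at physical address phys lies
 * entirely below 4GB, so that its address can be carried in a 32-bit
 * register. Returns 0 for an empty buffer or one that crosses the limit.
 * Hypothetical helper for illustration only.
 */
static int fits_below_4gb(uint64_t phys, uint64_t len)
{
    const uint64_t limit = UINT64_C(1) << 32;

    if (len == 0 || len > limit)
        return 0;
    return phys <= limit - len;
}
```

Comparing against `limit - len` rather than computing `phys + len` avoids
any overflow in the check itself.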

> interrupts is definitely not 2, and does not really follow the ideal
> behind 1 either.  I gave the firmware example earlier.  There are several
> devices that provide access to update firmware by reading and writing a
> firmware file directly in sysfs, then updating it on reboot if necessary.

But... the firmware loading example is bogus. We already have the Dell
RBU driver for system BIOS updates, and it has been accepted into the
kernel. This driver (dcdbas) has absolutely nothing to do with firmware
loading. I'm confused as to why you have brought up this example again
after Doug just finished saying that dcdbas has nothing to do with
firmware updates.

So, in a sense, we are _ALREADY_ following your advice, having already
split out the firmware driver into its own driver that sits under the
firmware/ class.

Sorry, but I think you misunderstand the whole point behind this
driver. This _is_ an abstraction.

For instance, if you have 16 journaling file systems in your kernel, it
would make a lot of sense to pull out the common journaling code and
create a separate journaling subsystem in the kernel, much like jbd. It
would then make sense to make people justify adding new journaling
methods to the kernel for a new file system, since there already exists
one journaling abstraction.

But, it only should go so far. Just because it makes sense to
standardize on one journaling layer in the kernel, doesn't mean that it
also makes sense to pull in all of mysql into the kernel.

In our case, we have a whole bunch of unrelated SMI calls to the BIOS
that have absolutely nothing in common except that they use the SMI
calling method. We have abstracted down to the lowest common denominator
of what we can put into the kernel to enable our whole management stack.
Rather than re-invent the SMI stack for each of these functions, we have
provided an abstraction.

To take a concrete example, I suggested to Doug to mention fan status. I
get the feeling that you possibly think that this would be better
integrated into lmsensors, or something like that. That really isn't the
case, as lmsensors is really geared towards bit-banging lm81 (for
example) chips to get fan status. In our case, we have a standardized
BIOS interface to get this info, and that standardized method involves
SMI and not bit-banging interfaces. Once this driver is accepted into
the kernel, we can go and add support in the _userspace_ lmsensors libs
to poll fan and temp using this driver.

For example, we already have at least one buggy implementation of this
exact stack in the kernel as the i8k driver. The i8k driver was
reverse-engineered and works, but it does not follow the spec at all, and
so is subject to major breakage if the BIOS changes. With dcdbas +
libsmbios, we can write this _correctly_, and in such a way that it
follows the spec and will not break on BIOS updates.

What you are asking us to do is just not feasible on many levels. First,
just from the number of functions we would have to implement. We would
have to implement about 60 new sysfs files, with at least 120 separate
functions for read/write. Each function would have to take into account
the specific calling requirements of that specific function. Then, we
would have to implement all of the bugfixes and platform-specific
workarounds in the kernel for each of those functions for each Dell
platform. Each time another function is added to BIOS, we would have to
go out and patch everybody's kernel to support the new function.

Besides the fact that this is just not a good design, there just isn't
the manpower to maintain all of these in the kernel a

Re: block ioctl to read/write last sector

2001-02-14 Thread Michael E Brown

On Wed, 14 Feb 2001 [EMAIL PROTECTED] wrote:

>
> Maybe. I think that you'll find that these blocks are
> relative to the start of the partition, not relative
> to the start of the disk.
>
> So if you add a 1-block partition that contains the last
> sector of the disk, all should be fine.
>

Ok. Upon re-reading the code in question, I was too hasty in my
assumptions. This might work, so I'll try it. I don't expect it to be
awfully pretty, though :-(

--
Michael Brown
Linux Systems Group
Dell Computer Corp




Re: block ioctl to read/write last sector

2001-02-14 Thread Michael E Brown

On Wed, 14 Feb 2001 [EMAIL PROTECTED] wrote:

>
> So if you add a 1-block partition that contains the last
> sector of the disk, all should be fine.
>

Oh! I didn't get your meaning before. I think I understand now. The
problem with this is that the tests for block writeability are not done on
a per-partition basis. They are done on a whole block device basis. See
fs/block_dev.c in block_read() and block_write(). The following test kills
us:

    if (blk_size[MAJOR(dev)])
        size = ((loff_t) blk_size[MAJOR(dev)][MINOR(dev)] <<
                BLOCK_SIZE_BITS) >> blocksize_bits;
    else
        size = INT_MAX;
    while (count>0) {
        if (block >= size)
            return written ? written : -ENOSPC;
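
A minimal userspace sketch of that bound check, assuming a 1001-sector disk
with 512-byte hardware sectors. The figures are illustrative, and
visible_blocks() and block_rejected() are hypothetical helper names, not
kernel functions. blk_size[][] holds the device size in 1K units, so the
odd trailing sector has already been rounded away before the comparison is
made:

```c
#define BLOCK_SIZE_BITS 10  /* blk_size[][] is kept in 1K units */

/* Number of whole soft blocks the quoted test lets through.
 * size_in_kb stands in for blk_size[MAJOR(dev)][MINOR(dev)]. */
long long visible_blocks(long long size_in_kb, int blocksize_bits)
{
    return (size_in_kb << BLOCK_SIZE_BITS) >> blocksize_bits;
}

/* Mirrors "if (block >= size) return ... -ENOSPC": returns 1 when the
 * requested soft block lies at or past the visible end of the device. */
int block_rejected(long long block, long long size_in_kb, int blocksize_bits)
{
    return block >= visible_blocks(size_in_kb, blocksize_bits);
}
```

For the assumed disk, 1001 sectors become 500 units of 1K, so
visible_blocks(500, 10) is 500. Sector 1000 (counting from 0) lives in soft
block 500, and block_rejected(500, 500, 10) is 1.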

--
Michael Brown
Linux Systems Group
Dell Computer Corp




Re: block ioctl to read/write last sector

2001-02-14 Thread Michael E Brown

On Wed, 14 Feb 2001 [EMAIL PROTECTED] wrote:

> But it changes the idea of odd and even.
> A partition can start on an odd sector.
>

That is orthogonal to the issue that I am trying to solve with my patch.
My code is trying to make it possible to access sectors at the _end_ of
the disk that you cannot access any other way. Example:

Disk with 1001 blocks. Hardware 512-byte sector size. The block layer uses
a 1024-byte soft blocksize. This means that, at the _end_ of the disk,
there is a single sector that represents half of a software block. The
block layer will not normally let you read or write that sector because it
is not a full software block.

Another example: Disk with 11 blocks (very small disk :-). Hardware
blocksize=512, Block layer uses 4096-byte blocksize. Now you have _three_
hardware blocks at the end of the disk that the block layer will not let
you read or write.

My patch allows an alternate method to access these sectors. My patch has
nothing to do with partitioning.

--
Michael Brown
Linux Systems Group
Dell Computer Corp




Re: block ioctl to read/write last sector

2001-02-14 Thread Michael E Brown

On Wed, 14 Feb 2001, David Balazic wrote:

> Michael E Brown ([EMAIL PROTECTED]) wrote:
>
> > That has been tried. No, it does not work. :-) Using Scsi-Generic is the
> > only way so far found, but of course, it only works on SCSI drives.
>
> Did you try scsi-emulation on IDE disks ?

I think that scsi-emulation works only for ATAPI devices. CDROMs are
normally ATAPI. HDs are normally just ATA. I don't think that would work,
but I have not tried it, either.
--
Michael Brown
Linux Systems Group
Dell Computer Corp




Re: block ioctl to read/write last sector

2001-02-13 Thread Michael E Brown

Martin,

  It looks like the numbers we picked for our respective IOCTLs conflict.
I think I can change mine to the next higher since your patch seems to
have been around longer. What is the general way to deal with these
conflicts?

--
Michael

On 13 Feb 2001, Martin K. Petersen wrote:

> > "Andries" == Andries Brouwer <[EMAIL PROTECTED]> writes:
>
> Andries> Anyway, an ioctl just to read the last sector is too silly.
> Andries> An ioctl to change the blocksize is more reasonable.
>
> I actually sent you a patch implementing this some time ago, remember?
> We need it for XFS...
>
> Patch against 2.4.2-pre3 follows.
>
>




Re: block ioctl to read/write last sector

2001-02-13 Thread Michael E Brown

On Wed, 14 Feb 2001, Manfred Spraul wrote:

> I have one additional user space only idea:
> have you tried raw-io? bind a raw device to the partition, IIRC raw-io
> is always in 512 byte units.

That has been tried. No, it does not work. :-)  Using Scsi-Generic is the
only way so far found, but of course, it only works on SCSI drives.

>
> Probably an ioctl is the better idea, but I'd use absolute sector
> numbers (not relative to the end), and obviously 64-bit sector numbers -
> 2 TB isn't that far away.
>

I was deliberately trying to limit the scope to avoid misuse. This is to
work around a flaw in the current API, not to create a new API. Limiting
access to only those blocks that would normally be inaccessible through
the normal API seemed like the best bet to me.

--
Michael Brown




Re: block ioctl to read/write last sector

2001-02-13 Thread Michael E Brown

Hi Andries!

On Tue, 13 Feb 2001 [EMAIL PROTECTED] wrote:

> >   The block device uses 1K blocksize, and will prevent userspace from
> > seeing the odd-block at the end of the disk, if the disk is odd-size.
> >
> >   IA-64 architecture defines a new partitioning scheme where there is a
> > backup of the partition table header in the last sector of the disk. While
> > we can read and write to this sector in the kernel partition code, we have
> > no way for userspace to update this partition block.
>
> Are you sure?

Yes.

The only alternative at this time is to use the scsi-generic tools to
read directly from the scsi-layer. But, of course, this only works with
scsi devices.

>
> There may be no easy, convenient way right now, but
> (without having checked anything) it seems to me
> that you can, also today.

Please go check :-)  I believe my statement stands: You cannot read or
write to odd-sectors at the end of the disk from userspace. (see below for
definition of odd sector...)

> Look at the addpart utility in the util-linux package.
> It will allow you to add a partition disjoint from
> previously existing partitions.
> And since a partition can start on an odd sector,
> this should allow you to also read the last sector.
>
> Do I overlook something?

Yes. The addpart utility just uses the block-layer ioctls to dynamically
add and/or remove partitions. What this is doing is just adjusting the
kernel's idea of what the current partition scheme is. This has _nothing_
to do with actually reading or writing data from the disk.

The ia64 gpt partitioning code defines a partition header at the front of
the disk and at the end of the disk. I definitely have a need to read and
write to these headers.

What this proposed patch does has _nothing_ to do with partitioning :-) It
is _only_ to read and write the last sector of the disk. It just so
happens that the reason that I have to read that last sector is to read a
partition header.

>
> Anyway, an ioctl just to read the last sector is too silly.
> An ioctl to change the blocksize is more reasonable.

That may be better, I don't know. That's why this is an RFC. Are there any
possible races with that method? It seems to me that you might adversely
affect io in progress by changing the blocksize. The method demonstrated
in this patch shouldn't do that.

> And I expect that this fixed blocksize will go soon.

That may be, I don't know that much about the block layer. All I know is
that, with the current structure, I cannot read or write to sectors where
(sector #) > total-disk-blocks - (total-disk-blocks %
(softblocksize/hardblocksize))

This ioctl can be deprecated when that is no longer the case.
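
The unreachable tail is whatever is left over once the disk has been cut
into whole soft blocks. A minimal sketch of that arithmetic, assuming the
soft blocksize is a whole multiple of the hard sector size; tail_sectors()
and first_unreachable_sector() are hypothetical names used only for
illustration:

```c
/* Hardware sectors at the end of the disk that do not fill a whole
 * soft block. Both block sizes are in bytes. */
unsigned long tail_sectors(unsigned long total_sectors,
                           unsigned int soft_blocksize,
                           unsigned int hard_blocksize)
{
    unsigned int ratio = soft_blocksize / hard_blocksize;

    return total_sectors % ratio;
}

/* First sector number, counting from 0, that ordinary block-layer
 * reads and writes cannot reach. Equals total_sectors when there is
 * no tail at all. */
unsigned long first_unreachable_sector(unsigned long total_sectors,
                                       unsigned int soft_blocksize,
                                       unsigned int hard_blocksize)
{
    return total_sectors -
           tail_sectors(total_sectors, soft_blocksize, hard_blocksize);
}
```

With 1001 sectors of 512 bytes and a 1024-byte soft blocksize the tail is
one sector (sector 1000, counting from 0). With 11 sectors and a 4096-byte
soft blocksize the tail is three sectors.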

>
> Andries
>

Thanks for the comments.

> [Sorry if precisely the same discussion has happened earlier -
> I have no memory.]
>

Not really. I have discussed this with some folks with Red Hat, but this
is the first discussion on L-K.
--
Michael Brown




[RFC][PATCH] block ioctl to read/write last sector (New! Improved!)

2001-02-13 Thread Michael E Brown

To address the concerns of Andi Kleen, with a good suggestion from Richard
Johnson, I have revised my previous patch attempt. Please check this out
and comment.

Changelog:
  1) use get_gendisk() instead of walking array manually
  2) pass in struct instead of guessing..
+   struct {
+   unsigned int block;
+   size_t content_length;
+   char *block_contents;
+   } blk_ioctl_parameter;

  3) On write, read in previous contents in case userspace doesn't modify
the whole block. (prevents garbage from being put in the remainder of the
sector)
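
Item 3 is a read-modify-write: the sector's current contents are fetched
first, and the caller's bytes are laid over the front of them. A minimal
sketch of the overlay step only, on plain memory buffers; the patch body is
cut off in this message, so merge_into_sector() is a hypothetical helper and
not the patch's code:

```c
#include <stddef.h>
#include <string.h>

/* Overlay caller data on the sector's existing contents.
 * At most sector_size bytes are copied; bytes past data_len keep their
 * previous on-disk values instead of turning into garbage.
 * Returns the number of bytes actually copied. */
size_t merge_into_sector(unsigned char *sector, size_t sector_size,
                         const unsigned char *data, size_t data_len)
{
    size_t n = data_len < sector_size ? data_len : sector_size;

    memcpy(sector, data, n);
    return n;
}
```

The sector buffer would be filled from disk before the call and written
back afterwards; a short write then changes only the leading bytes.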


On 7 Feb 2001, Andi Kleen wrote:
> But what happens when you e.g. run a software blocksize of 4096 and the device
> has >1 inaccessible 512 byte sector at the end?
> I think it would be better to pass in a offset in 512 byte units to a special
> ioctl (and do error checking in the driver for impossible requests)

This has been addressed in the new patch. The new patch takes in an LBA
offset and reads "LastLBA - offset" and returns the result to userspace.
Checks are done so that only normally inaccessible blocks are accessible
through this method. This is done to prevent any possible security
consequences from arising because of this patch.
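
A minimal sketch of what such a check could look like, assuming the offset
counts back from the last LBA in hardware sectors. The patch body is cut
off in this message, so resolve_tail_sector() and its parameters are
hypothetical and are not taken from the patch:

```c
/* Translate "LastLBA - offset" into an absolute sector number.
 * sectors_per_block is softblocksize / hardblocksize.
 * Returns -1 when the request is out of range or when the target sector
 * is already reachable through ordinary block reads and writes. */
long long resolve_tail_sector(unsigned long long total_sectors,
                              unsigned int sectors_per_block,
                              unsigned long long offset)
{
    unsigned long long tail;

    if (total_sectors == 0 || sectors_per_block == 0)
        return -1;

    /* Sectors left over after the last whole soft block. */
    tail = total_sectors % sectors_per_block;
    if (offset >= tail)
        return -1;

    return (long long)(total_sectors - 1 - offset);
}
```

On a 1001-sector disk with two sectors per soft block, offset 0 resolves to
sector 1000 and offset 1 is refused, because sector 999 is already visible
through the normal interface.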

Test code showing how to use this IOCTL has also been attached.
Regards,
Michael Brown


--Original Message--
Problem Summary:
  There is no function exported to userspace to read or write the last
512-byte sector of an odd-size disk.

  The block device uses 1K blocksize, and will prevent userspace from
seeing the odd-block at the end of the disk, if the disk is odd-size.

  IA-64 architecture defines a new partitioning scheme where there is a
backup of the partition table header in the last sector of the disk. While
we can read and write to this sector in the kernel partition code, we have
no way for userspace to update this partition block.

Solution:
  As an interim solution, I propose the following IOCTLs for the block
device layer: BLKGETLASTSECT and BLKSETLASTSECT. These ioctls will take a
userspace pointer to a char[512] and read/write the last sector. Below is
a patch to do this.

I have attached the patch as well, because I've heard that Pine will eat
patches. :-(







#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>

#include <sys/ioctl.h>

#define BLKGETLASTSECT  _IO(0x12,108) /* get last sector of block device */
#define BLKSETLASTSECT  _IO(0x12,109) /* set last sector of block device */
#define BLKSSZGET  _IO(0x12,104)/* get block device sector size */

struct blkdev_ioctl_param {
  unsigned int block;
  size_t content_length;
  char * block_contents;
};

int main(int argc, char *argv[])
{
  int filedes, rc=0, blksz=0;
  char *lastblock;
  struct blkdev_ioctl_param ioctl_param;

  ioctl_param.block = 0;

  filedes = open("/dev/sdb", O_RDONLY );
  printf("  open filedes: %d\n", filedes);

  rc = ioctl(filedes, BLKSSZGET, &blksz) ;
  lastblock = (char *) malloc( blksz + 8 );
  ioctl_param.content_length = blksz;
  printf("  block size appears to be %d\n", blksz);

  memset( lastblock, 'z', blksz + 8  );
  ioctl_param.block_contents = lastblock;

  rc = ioctl(filedes, BLKGETLASTSECT, &ioctl_param) ;
  printf("  last sect ioctl rc: %d\n", rc);

  lastblock[blksz + 1] = '\0';
  printf("  Last Block: \n%s<--\n", lastblock);

  return 0;
}




#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>

#include <sys/ioctl.h>

#define BLKGETLASTSECT  _IO(0x12,108) /* get last sector of block device */
#define BLKSETLASTSECT  _IO(0x12,109) /* set last sector of block device */
#define BLKSSZGET  _IO(0x12,104)/* get block device sector size */

struct blkdev_ioctl_param {
  unsigned int block;
  size_t content_length;
  char * block_contents;
};

int main(int argc, char *argv[])
{
  int filedes, rc=0, blksz=0;
  char *lastblock, newchar = 'a';
  struct blkdev_ioctl_param ioctl_param;
  
  if( argc > 1 ) newchar = argv[1][0];

  ioctl_param.block = 0;

  filedes = open("/dev/sdb", O_RDONLY );
  printf("  open filedes: %d\n", filedes);

  rc = ioctl(filedes, BLKSSZGET, &blksz) ;
  lastblock = (char *) malloc( blksz + 2 );
  ioctl_param.content_length = blksz ;
  printf("  block size appears to be %d\n", blksz );

  memset( lastblock, newchar, blksz + 2 );
  ioctl_param.block_contents = lastblock;

  rc = ioctl(filedes, BLKSETLASTSECT, &ioctl_param) ;
  printf("  last sect ioctl rc: %d\n", rc);

  return 0;
}



diff -ruP linux/drivers/block/blkpg.c linux-ioctl/drivers/block/blkpg.c
--- linux/drivers/block/blkpg.c Fri Oct 27 01:35:47 2000
+++ linux-ioctl/drivers/block/blkpg.c   Tue Feb 13 11:39:37 2001
@@ -39,6 +39,9 @@
 
 #include 
 
+static int set_last_sector( kdev_t dev, const void *param );
+static int get_last_sector( kdev_t dev, const void *param );
+
 /*
  * What is the data describing a partition?
  *
@@ -210,6 +213,16 @@
int intval;
 
switch (cmd) {
+   case

Re: [RFC][PATCH] block ioctl to read/write last sector

2001-02-07 Thread Michael E Brown

On 7 Feb 2001, Andi Kleen wrote:

> But what happens when you e.g. run a software blocksize of 4096 and the device
> has >1 inaccessible 512 byte sector at the end?
> I think it would be better to pass in a offset in 512 byte units to a special
> ioctl (and do error checking in the driver for impossible requests)

This is a valid point.

Can you tell me how it would come about that we would have a blocksize !=
1024?

Can you show the proposed interface to the new ioctl?

I was limited in that I could only figure out how to get one userspace
char* into/out of the ioctl. How would you propose to pass in the offset?
I had problems finding documentation on the more complicated IOCTL calls,
and since I am a kernel hacking novice, I went the easiest and most direct
route.

If you tell me the proposed interface and some sample code, I can code,
test and resubmit it. Thank you for the feedback.

Michael Brown
Linux System Group
Dell Computer Corp




[RFC][PATCH] block ioctl to read/write last sector

2001-02-06 Thread Michael E Brown


Problem Summary:
  There is no function exported to userspace to read or write the last
512-byte sector of an odd-size disk.

  The block device uses 1K blocksize, and will prevent userspace from
seeing the odd-block at the end of the disk, if the disk is odd-size.

  IA-64 architecture defines a new partitioning scheme where there is a
backup of the partition table header in the last sector of the disk. While
we can read and write to this sector in the kernel partition code, we have
no way for userspace to update this partition block.

Solution:
  As an interim solution, I propose the following IOCTLs for the block
device layer: BLKGETLASTSECT and BLKSETLASTSECT.  These ioctls will take a
userspace pointer to a char[512] and read/write the last sector. Below is
a patch to do this.

I have attached the patch as well, because I've heard that Pine will eat
patches. :-(

--
Michael Brown
Linux System Group
Dell Computer Corp


diff -ruP linux/drivers/block/blkpg.c linux-meb-clean/drivers/block/blkpg.c
--- linux/drivers/block/blkpg.c Fri Oct 27 01:35:47 2000
+++ linux-meb-clean/drivers/block/blkpg.c   Mon Jan 22 10:00:04 2001
@@ -39,6 +39,9 @@

 #include 

+static int set_last_sector( kdev_t dev, char *sect );
+static int get_last_sector( kdev_t dev, char *sect );
+
 /*
  * What is the data describing a partition?
  *
@@ -208,8 +211,19 @@
 int blk_ioctl(kdev_t dev, unsigned int cmd, unsigned long arg)
 {
int intval;
+unsigned long longval;

switch (cmd) {
+   case BLKGETLASTSECT:
+   return get_last_sector(dev, (char *)(arg));
+
+   case BLKSETLASTSECT:
+   if( is_read_only(dev) )
+   return -EACCES;
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
+   return set_last_sector(dev, (char *)(arg));
+
case BLKROSET:
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
@@ -281,3 +295,140 @@
 }

 EXPORT_SYMBOL(blk_ioctl);
+
+ /*
+  * get_last_sector()
+  *
+  * Description: This function will return the first 512 bytes of the last sector of
+  *a block device.
+  * Why: Normal read/write calls through the block layer will not read the last sector
+  *of an odd-size disk.
+  * parameters:
+  *dev: a kdev_t that represents the device for which we want the last sector
+  *sect: a userspace pointer, should be at least char[512] to hold the last 
+sector contents
+  * return:
+  *0 on success
+  *   -ERRVAL on error.
+  */
+int get_last_sector( kdev_t dev, char *sect )
+{
+struct buffer_head *bh;
+struct gendisk *g;
+int m, rc = 0;
+unsigned int lba;
+int orig_blksize = BLOCK_SIZE;
+int hardblocksize;
+
+if( !dev ) return -EINVAL;
+
+m = MAJOR(dev);
+for (g = gendisk_head; g; g = g->next)
+if (g->major == m)
+break;
+
+if( !g ) return -EINVAL;
+
+lba = g->part[MINOR(dev)].nr_sects - 1;
+
+if( !lba ) return -EINVAL;
+
+hardblocksize = get_hardblocksize(dev);
+if( ! hardblocksize ) hardblocksize = 512;
+
+ /* Need to change the block size that the block layer uses */
+if (blksize_size[MAJOR(dev)]){
+orig_blksize = blksize_size[MAJOR(dev)][MINOR(dev)];
+}
+if (orig_blksize != hardblocksize)
+   set_blocksize(dev, hardblocksize);
+
+bh =  bread(dev, lba, hardblocksize);
+if (!bh) {
+   /* We hit the end of the disk */
+   printk(KERN_WARNING
+  "get_last_sector ioctl: bread returned NULL.\n");
+   return -1;
+}
+
+rc = copy_to_user(sect, bh->b_data, (bh->b_size > 512) ? 512 : bh->b_size );
+
+brelse(bh);
+
+/* change block size back */
+if (orig_blksize != hardblocksize)
+   set_blocksize(dev, orig_blksize);
+
+return rc;
+}
+
+
+ /*
+  * set_last_sector()
+  *
+  * Description: This function will write the first 512 bytes of the last sector of
+  *a block device.
+  * Why: Normal read/write calls through the block layer will not read the last sector
+  *of an odd-size disk.
+  * parameters:
+  *dev: a kdev_t that represents the device for which we want the last sector
+  *sect: a userspace pointer, should be at least char[512] to hold the last 
+sector contents
+  * return:
+  *0 on success
+  *   -ERRVAL on error.
+  */
+int set_last_sector( kdev_t dev, char *sect )
+{
+struct buffer_head *bh;
+struct gendisk *g;
+int m, rc = 0;
+unsigned int lba;
+int orig_blksize = BLOCK_SIZE;
+int hardblocksize;
+
+if( !dev ) return -