from:"Olivier Galibert"

Re: [RFC PATCH v2 6/6] x86/entry/pti: don't switch PGD on when pti_disable is set

2018-01-11 Thread Olivier Galibert

Wouldn't the time taken by an easy syscall like getuid be a clear indicator?

  OG.


On Thu, Jan 11, 2018 at 8:17 PM, Dave Hansen
 wrote:
> On 01/11/2018 11:07 AM, Borislav Petkov wrote:
>> On Thu, Jan 11, 2018 at 10:57:51AM -0800, Dave Hansen wrote:
>>> I'd love to have a tool that tells you for sure "KPTI enabled or not",
>>> but I'd also love to have it be something I can easily distribute
>>> without it being handled like a WMD.
>> You mean this:
>>
>> https://git.kernel.org/tip/87590ce6e373d1a5401f6539f0c59ef92dd924a9
>
> I meant that works across all the kpti implementations.  Those with gunk
> in /sys, /proc/cpuinfo, or with nothing at all like the original KAISER
> patches I posted.
>
> And, yes, I want a pony.  I just thought someone had such a pony handy
> and it was trivial.

Re: Linux 4.15-rc7

2018-01-11 Thread Olivier Galibert

Wasn't/Isn't the 4G/4G  memory layout for 32 bits essentially KPTI?

  OG.


On Thu, Jan 11, 2018 at 12:32 AM, Pavel Machek  wrote:
> Hi!
>
>> The one thing I want to do now that Meltdown and Spectre are public,
>> is to give a *big* shout-out to the x86 people, and Thomas Gleixner in
>> particular for really being on top of this.  It's been one huge
>> annoyance, and honestly, Thomas really went over and beyond in this
>> whole mess.  A lot of other people have obviously been involved too,
>
> As I understand it: KPTI prevents Meltdown attack on x86-64, but
> Spectre means even x86-64 is not expected to be safe?
>
> Ok, so Meltdown is public... And I still have some nice 32-bit
> machines I'd like to keep working.
>
> Proof of concept is out, https://github.com/IAIK/meltdown/ .
>
> Is anyone working on KPTI for x86-32? SLES11 should still be
> supported, and that should have x86-32 version; any chance SUSE can
> share some patches?
>
> Thanks,
> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [PATCH] vfs: hard-ban creating files with control characters in the name

2017-10-05 Thread Olivier Galibert

On Tue, Oct 3, 2017 at 5:22 AM, Adam Borowski  wrote:
> Well, what about just \n then?  Unlike all the others which are relatively
> straightforward, \n requires -print0 which not all programs implement, and
> way too many people consider too burdensome to use.

If you don't use -print0, you're vulnerable to spaces.  Go on, try to
disallow spaces in file names, I'll pass the popcorn.

  OG.

Re: [GIT PULL] kdbus for 4.1-rc1

2015-04-21 Thread Olivier Galibert

On Tue, Apr 21, 2015 at 12:31 PM, Greg Kroah-Hartman
 wrote:
> Bringing up SCM_RIGHTS means that this is not going to be a bus system
> at all.  One principal design goal is to _not_ have peer-to-peer
> connections between all communicating parties, but rather one connection
> to a central component.  If that component is not in the kernel, it has
> to be a userspace daemon, which in turn has all of the issues that
> dbus-daemon currently has.

You're not making sense there.  If there is no daemon, then you're
peer-to-peer, because there's no central component.  If you consider
the kernel the central component, then peer-to-peer is almost
impossible by definition.

It seems that almost everybody here thinks that the plumbing (e.g.
transmitting messages in-order with multicasting) should be separated
from the policy (who communicates with who), possibly leveraging the
packet filtering infrastructure to implement the decided policy.  What
is it you reject about that point of view, which seems relatively
normal when you think about building a collection of useful tools?

  OG.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: ia32_sysenter_target does not preserve EFLAGS

2015-03-28 Thread Olivier Galibert

  Hi,

Beware that could be opening the door to information leaks for a very
small gain (most syscalls are not getuid).

Best,

  OG.


On Sat, Mar 28, 2015 at 1:34 AM, Denys Vlasenko
 wrote:
> On Fri, Mar 27, 2015 at 9:00 PM, Linus Torvalds
>  wrote:
>> On Fri, Mar 27, 2015 at 7:25 AM, Denys Vlasenko  wrote:
>>>
>>> Apparently, users *don't* depend on arithmetic flags
>>> surviving across a syscall. They are also okay with the DF flag
>>> being cleared.
>>
>> Generally, users probably don't care about many registers at all being
>> saved, but it's worth noting that the reason system calls save/restore
>> even caller-saved registers is at least partly in order to avoid any
>> kernel information leaks.
>>
>> I don't believe that user mode will ever reasonably care about the
>> arithmetic flags being changed, but at the same time I also don't think it
>> is something we should ever consider a "feature" we should try to take
>> advantage of. Generally we should try to not mess with the flag state,
>> and I'd *much* rather make the rule be that all the system call return
>> paths restore flags as much as possible.
>
> "We don't clobber anything" ABI has its appeal.
> OTOH, fulfilling ABI's promises has a cost which has to be paid
> on every syscall, regardless whether userspace needed it or not.
>
> Example. This is the uclibc implementation of write():
>
> 004acfc4 <__libc_write>:
>   4acfc4:   53                      push   %rbx
>   4acfc5:   48 63 ff                movslq %edi,%rdi
>   4acfc8:   b8 01 00 00 00          mov    $0x1,%eax
>   4acfcd:   0f 05                   syscall
>   4acfcf:   48 89 c3                mov    %rax,%rbx
>   4acfd2:   48 81 fb 00 f0 ff ff    cmp    $0xfffffffffffff000,%rbx
>   4acfd9:   76 0f                   jbe    4acfea <__libc_write+0x26>
>   4acfdb:   e8 64 15 00 00          callq  4ae544 <__GI___errno_location>
>   4acfe0:   89 da                   mov    %ebx,%edx
>   4acfe2:   f7 da                   neg    %edx
>   4acfe4:   89 10                   mov    %edx,(%rax)
>   4acfe6:   48 83 c8 ff             or     $0xffffffffffffffff,%rax
>   4acfea:   5b                      pop    %rbx
>   4acfeb:   c3                      retq
>
> This is a C function. Therefore any caller of it assumes that C-clobbered
> registers can be, indeed, clobbered here, so if that caller uses any
> of them, it saves/restores them.
>
> All efforts by kernel code to save/restore C-clobbered registers,
> eight of them, are in vain. It's just useless work. Userspace
> does not benefit from that effort.
>
> If our syscall ABI would say that those regs are not preserved,
> we could have a bit faster syscalls. Any userspace code which
> really had to have those registers preserved across a particular
> syscall, could push/pop them itself.

Re: [PATCH 1/6] tty: serial: 8250 core: provide a function to export uart_8250_port

2014-07-10 Thread Olivier Galibert

On Wed, Jul 9, 2014 at 7:49 PM, Sebastian Andrzej Siewior
 wrote:
> + * The lock assumption made here is none because runtime-pm suspend/resume
> + * callbacks should not be invoked there is any operation performed on the port.

I think there's a missing "if"?

Best,

  OG.

Re: [ATTEND] How to act on LKML (was: [ 00/19] 3.10.1-stable review)

2013-07-16 Thread Olivier Galibert

On Tue, Jul 16, 2013 at 9:32 AM, David Lang  wrote:
> On Mon, 15 Jul 2013, Sarah Sharp wrote:
>
>> The people who want to work together in a civil manner should get
>> together and create a "Kernel maintainer's code of conduct" that
>> outlines what they expect from fellow kernel developers.  The people who
>> want to continue acting "unprofessionally" should document what
>> behaviors set off their cursing streaks, so that others can avoid that
>> behavior.  Somewhere in the middle is the community behavior all
>> developers can thrive in.
>
>
> By defining your viewpoint as being "professional" and the other viewpoint
> as being "unprofessional" you have already started using very loaded terms
> and greatly reduced the probability of actually getting the other group to
> agree and participate.

Especially since you can very easily translate these terms into
"American" and "non-American".

The stereotypical American attitude to professionalism is to be polite
at the word-choice level, the better to hide a profound disrespect
underneath.  No meaning is taken into account, it's just keyword
spotting.  "Your code is crap" is considered unprofessional, while
"Let's leverage my fifth grade nephew's capabilities to assist you in
fixing the code" is perfectly professional, somehow.  That's more
often than not an unacceptable attitude in Europe.

  OG.

Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-14 Thread Olivier Galibert

On Sat, Jul 14, 2012 at 12:33:51AM +0200, Jesper Juhl wrote:
> How about we start cutting down on the options and start saying "a Linux 
> system will provide feature x and y - always ...".
> Stuff like (and I'm just pulling random stuff out here) - ASLR, seccomp, 
> 250HZ minimum etc etc.. We could cut the KConfig options down to 10% of 
> what they are now if we just made a few (hard) choices about some things 
> that would always be there that everyone could count on.  If people want 
> to deviate from the default minimum, sure, let them, but put it under 
> *custom*, *embedded*, *specialized distro*, *you know what you are doing* 
> menu options.

In number of options the "infrastructure" options are at most 20% of
the total.  The other 80% are individual drivers, hardware (like
network cards, serial devices, usb devices, video...) or software
(crypto algorithms, partition formats, codepages, filesystems...).
You're going to have a hard time slashing 90% of that.

  OG.


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Olivier Galibert

On Mon, Feb 04, 2008 at 05:57:47PM -0500, Jeff Garzik wrote:
> iSCSI and NBD were passe ideas at birth.  :)
> 
> Networked block devices are attractive because the concepts and 
> implementation are more simple than networked filesystems... but usually 
> you want to run some sort of filesystem on top.  At that point you might 
> as well run NFS or [gfs|ocfs|flavor-of-the-week], and ditch your 
> networked block device (and associated complexity).

Call me a sysadmin, but I find it easier to plug in and keep in place an
ethernet cable than these parallel scsi cables from hell.  Every
server has at least two ethernet ports by default, with rarely any
surprises at the kernel level.  Adding ethernet cards is inexpensive,
and you pretty much never hear of compatibility problems between
cards.

So ethernet as a connection medium is really nice compared to scsi.
Too bad iscsi is demented and ATAoE/NBD nonexistent.  Maybe external
SAS will be nice, but I don't see it getting to the level of
universality of ethernet any time soon.  And it won't get the same
amount of user-level compatibility testing in any case.

  OG.


Re: section breakage on ppc64 (aka __devinitconst is broken by design)

2008-02-03 Thread Olivier Galibert

On Sun, Feb 03, 2008 at 09:02:08PM +, Al Viro wrote:
> On ppc64 relocs => r/w, AFAICS.  On other targets we might have any number
> of other rules.

And -fpic/PIC => (relocs => r/w) because of the DT_TEXTREL crap.  Not
of immediate interest to the kernel though.

  OG.

Re: Why is the kfree() argument const?

2008-01-18 Thread Olivier Galibert

On Fri, Jan 18, 2008 at 05:45:49PM +0100, [EMAIL PROTECTED] wrote:
> The malloc attribute is exactly about this : giving the compiler the
> indication that no other pointer aliases this object, allowing for
> better optimizations.

If you put a malloc attribute on the allocator and no free attribute
on the deallocator, you can get bugs indeed.  GIGO.


> Yes. Bad things start to happen when users add wrong indications to
> the compiler. By adding the "const" indication to kfree(), the programmer
> wrongly tells that it can optimize reading the values pointed to before or
> after calling the function (if it is also sure that they cannot be
> read/written otherwise). Current gcc implementations seem quite
> conservative in this regard, and don't optimize that much, but what about
> the future?

The future should be quite nice because:

- the compiler can not know that kmalloc does not have an alias to
  the pointer tucked somewhere accessible by other non-inline functions
  (as kfree is), especially since it does have aliases in practice, so
  it cannot prove the "not read/written otherwise" part without the
  malloc attribute

- if you add the (non-standard C) malloc attribute to kmalloc, then
  you also add the free attribute to kfree which tells the compiler
  that the pointer is invalid after the call, which ensures no
  accesses will be moved after it

  OG.

Re: Why is the kfree() argument const?

2008-01-18 Thread Olivier Galibert

On Thu, Jan 17, 2008 at 09:02:44PM -0800, David Schwartz wrote:
> 3) It is most useful for 'kfree' to be non-const because destroying an
> object through a const pointer can easily be done in error. One of the
> reasons you provide a const pointer is because you need the function you
> pass the pointer to not to modify the object. Since this is an unusual
> operation that could be an error, it is logical to force the person doing it
> to clearly indicate that he knows the pointer is const and that he knows it
> is right anyway.

Freeing a const pointer is not and has never been unusual.  It happens
all the time for objects whose lifecycle is "initialise at the start,
readonly afterwards", of which names, in particular in the form of
strings, are a large subset.  It also happens in cases of late
deletion on refcounted objects, when the main owner (the one who is
allowed to change the object and has the non-const pointer) has
dropped its reference, but some object needs a readonly instance a
little longer.  Think virtual files in proc, sysfs or friends kept
open after the underlying information source, often a device, is gone.

  OG.

Re: Why is the kfree() argument const?

2008-01-18 Thread Olivier Galibert

On Fri, Jan 18, 2008 at 08:53:44AM -0500, Andy Lutomirski wrote:
> I'd say this implies the exact opposite.  It almost sounds like the 
> compiler is free to change:
> 
> void foo(const int *x);
> foo(x);
> printf("%d", *x);
> 
> to:
> 
> void foo(const int *x);
> printf("%d", *x);
> foo(x);

That's only if neither function has side effects noticeable by the
other.  Invalidating the pointer in (k)free is rather noticeable.

> (Note that this isn't just a problem for optimizers -- a programmer 
> might expect that passing a pointer to a function that takes a const 
> pointer argument does not, in and of itself, change the pointed-to 
> value.  Given that const certainly does not mean that no one else 
> changes the object, I'm not sure what else it could mean.

Most of the time, a const pointer argument means "I won't change the
contents of the object so that you'll notice by reading it in a normal
way afterwards".  That's pretty much what mutable in a variety of
languages (including C++) is about, saying "this field is internal
management stuff not visible from the external interface, so I need to
be able to change it even through const pointers I got as parameters".
Reference counters for copy-on-write setups are the usual example of
use.

In the case of deallocation functions you are not allowed to do
anything through the pointer or its aliases after the function
returns.  So we're outside of the "most of the time" case, since
you're not allowed to try to notice any change.  Pragmatism takes
over: you want the parameter type that accepts as many pointer types
as possible while staying reasonable (volatile is never reasonable),
and that's const void *.  As simple as that.

As for releasing resources through const pointers, that happens all
the time as soon as your const use is tight, and if you think forcing
the systematic addition of a (void *) cast is going to make your code
more readable, well, you need more experience in maintaining other
people's applications.

> kfree does not have either property, so I don't think it makes
> sense for it to take a const argument.

delete in C++ allows const pointers. Think about it.

  OG.

Re: [linux-pm][PATCH] base: Change power/wakeup output from "" to "unsupported" if wakeup feature isn't supported by a device

2008-01-04 Thread Olivier Galibert

On Fri, Jan 04, 2008 at 11:38:29AM -0500, Alan Stern wrote:
> How about changing it to say "unavailable"?  That doesn't imply 
> permanence.

How about not changing a userland-visible interface gratuitously?

  OG.


Re: INITIO scsi driver fails to work properly

2007-12-17 Thread Olivier Galibert

On Mon, Dec 17, 2007 at 06:08:59PM +0200, Boaz Harrosh wrote:
> Below fixes a deadly typo. Might as well be included in 2.6.24

Are you sure?  scsi_for_each_sg includes a (sg)++ already...


>   scsi_for_each_sg(cmnd, sglist, cblk->sglen, i) {
>   sg->data = cpu_to_le32((u32)sg_dma_address(sglist));
>   total_len += sg->len = cpu_to_le32((u32)sg_dma_len(sglist));
> + ++sg;
>   }

  OG.

Re: can support for "rpm"-based package building just be dropped?

2007-11-27 Thread Olivier Galibert

On Mon, Nov 26, 2007 at 05:17:18PM +0100, Jan Engelhardt wrote:
> rpm -b does not work in opensuse anymore (redirects you to use rpmbuild), and
> I bet fedora will do the same, so if you don't have rpm-build, tough luck for
> make rpm.

The point, if I understand it correctly, was not to fall back to rpm
when rpmbuild is not installed and rpm is not build-capable, and
instead to yell loudly that rpmbuild is missing.

To which some said that right now "do not fall back, ever, just
require rpmbuild" may just be the best.

I have no opinion though.

  OG.


Re: Why is FIBMAP ioctl root only?

2007-11-22 Thread Olivier Galibert

Original thread btw:
  http://www.ussg.indiana.edu/hypermail/linux/kernel/9907.0/0132.html

  OG.

Re: Why is FIBMAP ioctl root only?

2007-11-22 Thread Olivier Galibert

On Thu, Nov 22, 2007 at 07:17:14PM +0100, Jan Kara wrote:
>   Hi,
> 
>   I guess subject says it all - why is FIBMAP ioctl restricted only to
> root (CAP_SYS_RAWIO)? Corresponding ioctl for XFS is allowed without any
> special capabilities so we are inconsistent here too...
>   Would anyone mind if the check is removed?

Once upon a time some filesystems fucked up when given incorrect values
(negative offsets in particular).  So the easy way out was taken and
FIBMAP was restricted, to the eternal annoyance of DVD players which
needed the sector number for CSS reasons.  Since then DVD players have
included a UDF parser and life went on.

Well, psx movie players needed it too, but bah.

Essentially if you remove the restriction you have to audit all
filesystems to be sure that they're not going to be problematic.

  OG.


The /proc/acpi/video/*/DOS default change broke my system

2007-11-19 Thread Olivier Galibert

T'was done as a21101c46ca5b4320e31408853cdcbf7cb1ce4ed.  The system is
a Latitude X300 (i855GM).  With '1', the old default, I can close and
re-open the lid and have nothing happening.  With '0' the screen turns
black with the mouse cursor left frozen on top of it and the computer
crashes.

Closing the lid also raises a "video bus notify".

How can I go about debugging that?

  OG.


Re: [alsa-devel] [BUG] New Kernel Bugs

2007-11-15 Thread Olivier Galibert

On Thu, Nov 15, 2007 at 06:59:34AM +0100, Rene Herman wrote:
> Totally unrelated indeed so why are you spouting crap? If the kohab list has a 
> problem take it up with them but keep ALSA out of it. alsa-devel has only 
> ever moderated out spam -- nothing else.

That is incorrect.  Hopefully it is the case now though, since my
experience of the subject was years ago.

  OG.

Re: [PATCH 09/10] Change table chaining layout

2007-10-24 Thread Olivier Galibert

On Wed, Oct 24, 2007 at 03:38:04PM +0200, Jens Axboe wrote:
> (please don't drop cc lists)

Sorry.  Reactions of people to Cc vary...

> That doesn't make any sense. Both sg_set_buf() and sg_set_page() set the
> same thing in the sg entry, the input is just different. It has nothing
> to do with setting the physical value, for instance.

Ok.  I misunderstood the sg_virt/sg_phys difference I guess.  No
problem.

  OG.

Re: [PATCH 09/10] Change table chaining layout

2007-10-24 Thread Olivier Galibert

On Wed, Oct 24, 2007 at 11:12:42AM +0200, Jens Axboe wrote:
> sg_set_buf() also sets length and offset, sg_set_page() is just a mirror
> of that. So I'd prefer to keep the naming.

Hmmm, sg_set_phys/sg_set_virt to be more symmetrical to
sg_phys/sg_virt?

  OG.

Re: Point of gpl-only modules (flame)

2007-10-02 Thread Olivier Galibert

On Tue, Oct 02, 2007 at 11:49:04PM +0200, Jimmy wrote:
> Also, how about a list of PROS, explain to me whats so cool about it?

People who do binary-only drivers have a much better chance of not
doing a derivative work when they only use non-EXPORT_GPL exports, and
as a result not being in the wrong legally.

  OG.


Re: [PATCH] Version 3 (2.6.23-rc8) Smack: Simplified Mandatory Access Control Kernel

2007-10-01 Thread Olivier Galibert

On Mon, Oct 01, 2007 at 09:04:44AM -0700, Linus Torvalds wrote:
> For example, you security guys still debate "inodes" vs "pathnames", as if 
> that was an either-or issue.
> 
> Quite frankly, I'm not a security person, but I can tell a bad argument 
> from a good one. And an argument that says "inodes _or_ pathnames" is so 
> full of shit that it's not even funny. And a person who says that it has 
> to be one or the other is incompetent.

Not so much incompetent as religious fundamentalist.  Which is worse.

  OG.

Re: Chroot bug

2007-09-26 Thread Olivier Galibert

On Wed, Sep 26, 2007 at 08:43:44PM +0930, David Newall wrote:
> Olivier Galibert wrote:
> >chroot does not allow you to walk out if you're in.
> 
> You're mistaken.  Or more properly, further use of chroot lets you walk 
> out.  This really has been said before, and before, and before.
> 
>chroot("subtree");   // enter chroot
>chdir("/");// now at subtree
>chroot("/tmp");   // now outside of chroot

Of course.  chroots are not a stack, they're just a point in the
namespace.  You change it, the conditions apply to the new one.

> BSD redefined chroot so that the working directory is set to the new 
> root on subsequent uses of chroot; that's how they solved the bug.

They didn't solve a thing.  fchdir baby.  Unless you want to remove
fchdir.  And mknod.  And mount.  And so many other different syscalls
that I don't even know the list.

  OG.

Re: Chroot bug

2007-09-26 Thread Olivier Galibert

On Wed, Sep 26, 2007 at 07:57:38PM +0930, David Newall wrote:
> As has been said, there are thousands of ways to break out of a chroot.  
> It's just that one of them should not be that chroot lets you walk out.  

chroot does not allow you to walk out if you're in.  It only allows
you to walk outside if you're *already* out.  That's the way it is
defined.  Those who want some kind of chroot for security reasons
should look at (BSD's ?) jail, and/or hypervisors.

  OG.

Re: false positive in checkpatch.pl (complex macro values)

2007-08-24 Thread Olivier Galibert

On Fri, Aug 24, 2007 at 05:43:47AM -0700, SL Baur wrote:
> Who uses code like this, by the way?

People who think Posix is an example to follow maybe?  Not sure if it
would go past the maintainers though :-)

# define PTHREAD_MUTEX_INITIALIZER \
  { { 0, 0, 0, 0, 0, { 0 } } }
# ifdef __USE_GNU
#  define PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP \
  { { 0, 0, 0, PTHREAD_MUTEX_RECURSIVE_NP, 0, { 0 } } }
#  define PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP \
  { { 0, 0, 0, PTHREAD_MUTEX_ERRORCHECK_NP, 0, { 0 } } }
#  define PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP \
  { { 0, 0, 0, PTHREAD_MUTEX_ADAPTIVE_NP, 0, { 0 } } }
# endif

  OG.


Re: Is it time for remove (crap) ALSA from kernel tree ?

2007-06-25 Thread Olivier Galibert

On Mon, Jun 25, 2007 at 02:58:02PM +0200, Takashi Iwai wrote:
> Hm...  I don't agree much with the virtual relay device solution.
> I once experimentally implemented an ALSA-OSS virtual kernel driver.
> But, it just gives more complexity.

So instead you move the complexity in the library where it is worse.


> Yes, the library solution has merits and demerits.  The library should
> have been differently designed.  But, I don't think the virtual relay
> is the best solution just because you can use a bare kernel
> interface...

Whatever you do in the library won't solve the problem of properly
supporting the OSS interface.

  OG.


Re: Is it time for remove (crap) ALSA from kernel tree ?

2007-06-25 Thread Olivier Galibert

On Mon, Jun 25, 2007 at 02:40:23PM +0200, Jan Engelhardt wrote:
> 
> On Jun 25 2007 14:31, Takashi Iwai wrote:
> >> It was started at a time when most cheap sound cards were without a hw mixer.
> >> And .. when today you use ALSA on a sound card without a hw mixer, all 
> >> these (past ?) problems are still current.
> >
> >Huh?  I have no problems with soft mixing...
> 
> Diverging from the discussion, how is soft mixing actually done? If it was
> done in userspace, it would need shared memory, or a back relay from
> kernelspace to userspace (and back again for the final output), otherwise I
> could not imagine how all alsa streams came together at one point.

SysV shared memory and semaphores, done in the alsa lib.

Yes, your kernel sound access library does shared mem, semaphores,
fork+exec and friends.

Back relay and virtual devices is the way it should have been done.

  OG.

Re: Is it time for remove (crap) ALSA from kernel tree ?

2007-06-25 Thread Olivier Galibert

On Mon, Jun 25, 2007 at 02:31:08PM +0200, Takashi Iwai wrote:
> So, do you mean the soft-mixing is the biggest issue?  That's just a
> part of a design issue, and if we want to go to that way, the
> implementation would be trivial, regardless of ALSA or not.  Totally 
> irrelevant argument regarding "remove ALSA".

Soft mixing is actually the biggest issue because if you had
generalized soft-mixing in the kernel-visible audio ports[1] you would
win two things:

- programs could use the OSS API without interfering with the ALSA one
  or with each other

- programs could use the ALSA kernel API directly without interfering
  either, which would allow alternative libalsa implementations for
  those who hate the current one

Frankly, mandatory libraries are extremely annoying, and mandatory
extremely complex overdesigned libraries are simply unbearable.

  OG.

[1] Which does *not* mean doing the mixing in the kernel.

Re: Is it time for remove (crap) ALSA from kernel tree ?

2007-06-24 Thread Olivier Galibert

On Sun, Jun 24, 2007 at 09:57:24PM +0100, Alan Cox wrote:
> > Sorry Alan, but I don't want a philosophical/historical discussion.
> > Try to answer the question "ALSA or OSS ?" using *only* technical arguments.
> 
> We dropped OSS for ALSA for technical reasons. Those being that ALSA
> - has a better audio API

You mean the undocumented, 100% ioctl one?  With one ioctl to write
interleaved sound, one for non-interleaved sound, in addition to
setting interleaved or not in the configuration?  I should check one
day which one wins.

Or the "library"?  Don't get me started on this one.

I take your word about the fact that the kernel side is better.  The
userland side, not so much.

  OG.

Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3

2007-06-14 Thread Olivier Galibert

On Thu, Jun 14, 2007 at 09:20:35PM -0400, Rob Landley wrote:
> Why do you keep saying "upgraded" to GPLv3?  How is it an improvement to move 
> from a small, simple, elegant, and tested implementation to something that's 
> more complicated, less elegant, less coherent, totally untested, and full of 
> numerous special cases?

Ahhh, but so much more enterprisey.  I had never realized before that
the DailyWTF applied to licenses too.

  OG.


Re: [PATCH -mm] 1/2: MMCONFIG: validate against ACPI motherboard resources

2007-06-04 Thread Olivier Galibert

On Tue, May 29, 2007 at 10:03:32PM -0600, Robert Hancock wrote:
> -Validate that the area is reserved even if we read it from the
> chipset directly and not from the MCFG table. This catches the case
> where the BIOS didn't set the location properly in the chipset and
> has mapped it over other things it shouldn't have.

Just for the record, I still fundamentally disagree with that part.
You're not catching what you think you're catching, since the chipset
tells you what it is going to decode as mmconfig, no matter what is
connected to it.

  OG.

Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-05-23 Thread Olivier Galibert

On Wed, May 23, 2007 at 02:20:23PM -0700, Jesse Barnes wrote:
> On Wednesday, May 23, 2007 1:56 pm Linus Torvalds wrote:
> > Ehh. Even for PCIe, why not use the normal accesses for the first 256
> > bytes? Problem solved.
> 
> Ok, this patch also works.  We still need to enable mmconfig space for 
> PCIe and extended config space, but we can continue to use type 1 
> accesses for legacy PCI config space cycles to avoid decode trouble 
> with mmconfig based BAR sizing.

Isn't that a mac-intel instant killer?  AFAIK they don't have type1,
period.

  OG.

Re: [PATCH] mmconfig: Some additional chipset register values validation.

2007-05-02 Thread Olivier Galibert

On Wed, May 02, 2007 at 11:52:36AM +0200, Andi Kleen wrote:
> On Wednesday 02 May 2007 02:50:11 Olivier Galibert wrote:
> > On i945, a mmconfig range hitting the f000- zone conflicts
> > with the APIC registers and others.  Consider it invalid.
> > 
> > On E7520, values  and f000 for the window register are defined
> > invalid in the documentation.
> 
> Added thanks

Oh, feel free to add the:
Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>

I forgot in the original mail.

  OG.

[PATCH] mmconfig: Some additional chipset register values validation.

2007-05-01 Thread Olivier Galibert

On i945, a mmconfig range hitting the f000- zone conflicts
with the APIC registers and others.  Consider it invalid.

On E7520, values  and f000 for the window register are defined
invalid in the documentation.

---

I haven't seen a bios use these values, but who trusts biosen these
days?


 arch/i386/pci/mmconfig-shared.c |   25 +
 1 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/i386/pci/mmconfig-shared.c b/arch/i386/pci/mmconfig-shared.c
index 747d8c6..c7cabee 100644
--- a/arch/i386/pci/mmconfig-shared.c
+++ b/arch/i386/pci/mmconfig-shared.c
@@ -60,14 +60,19 @@ static const char __init *pci_mmcfg_e7520(void)
u32 win;
pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
 
-   pci_mmcfg_config_num = 1;
-   pci_mmcfg_config = kzalloc(sizeof(pci_mmcfg_config[0]), GFP_KERNEL);
-   if (!pci_mmcfg_config)
-   return NULL;
-   pci_mmcfg_config[0].address = (win & 0xf000) << 16;
-   pci_mmcfg_config[0].pci_segment = 0;
-   pci_mmcfg_config[0].start_bus_number = 0;
-   pci_mmcfg_config[0].end_bus_number = 255;
+   win = win & 0xf000;
+   if(win == 0x || win == 0xf000)
+   pci_mmcfg_config_num = 0;
+   else {
+   pci_mmcfg_config_num = 1;
+   pci_mmcfg_config = kzalloc(sizeof(pci_mmcfg_config[0]), 
GFP_KERNEL);
+   if (!pci_mmcfg_config)
+   return NULL;
+   pci_mmcfg_config[0].address = win << 16;
+   pci_mmcfg_config[0].pci_segment = 0;
+   pci_mmcfg_config[0].start_bus_number = 0;
+   pci_mmcfg_config[0].end_bus_number = 255;
+   }
 
return "Intel Corporation E7520 Memory Controller Hub";
 }
@@ -108,6 +113,10 @@ static const char __init *pci_mmcfg_intel_945(void)
if ((pciexbar & mask) & 0x0fffU)
pci_mmcfg_config_num = 0;
 
+   /* Don't hit the APIC registers and their friends */
+   if ((pciexbar & mask) >= 0xf000U)
+   pci_mmcfg_config_num = 0;
+
if (pci_mmcfg_config_num) {
pci_mmcfg_config = kzalloc(sizeof(pci_mmcfg_config[0]), 
GFP_KERNEL);
if (!pci_mmcfg_config)
-- 
1.5.1.81.gee969


Re: [RFC PATCH] PCI MMCONFIG: add validation against ACPI motherboard resources

2007-04-30 Thread Olivier Galibert

On Sun, Apr 29, 2007 at 08:14:37PM -0600, Robert Hancock wrote:
> -Validate that the area is reserved even if we read it from the
> chipset directly and not from the MCFG table. This catches the case
> where the BIOS didn't set the location properly in the chipset and
> has mapped it over other things it shouldn't have.  This might be
> overly pessimistic - we might be able to instead verify that no
> other reserved resources (like chipset registers) are inside this
> memory range.

I have a fundamental problem with that: you don't validate a higher
reliability information against a lower one.  The chipset registers
are high reliability.  Modulo unknown hardware errata and bugs in the
code (and accepting f000 is in practice a bug in the code, the
docs are starting to catch up with it too), the chipset *will* decode
mmconfig at the looked up address no matter what.  On the other side,
the ACPI data is bios generated, and that is well known to be horribly
unreliable.  Hell, if it was reliable we could just use the MCFG ACPI
table without questions.

So you can check the ACPI stuff for coherency (MCFG vs. the rest), you
can validate the ACPI stuff against the results of the lookup if you
want, but validating the lookup against ACPI is nonsensical.

  OG.


Re: Back to the future.

2007-04-26 Thread Olivier Galibert

On Thu, Apr 26, 2007 at 03:49:51PM -0700, David Lang wrote:
> swap partitions are limited to 2G (or at least they were a couple of months 
> ago when I last checked). I also don't want to run the risk of having a box 
> try to _use_ 16G worth of swap. I'd rather have the box hit OOM first.

They aren't limited anymore, I have a number of machines with 20G swap
for experiments.

  OG.


Re: Back to the future.

2007-04-26 Thread Olivier Galibert

On Fri, Apr 27, 2007 at 06:50:56AM +1000, Nigel Cunningham wrote:
> I'm perfectly willing to think through some alternate approach if you
> suggest something or prod my thinking in a new direction, but I'm afraid
> I just can't see right now how we can achieve what you're after.

Ok, what about this approach I've been mulling over for a while:

Suspend-to-disk is pretty much an exercise in state saving.  There are
multiple ways to do state saving, but they tend to end up in two
categories: implicit and explicit.

In implicit state saving, you try to save the state of the
system/application/whatever "under its feet", more or less, and then
fix up what is not saved/saveable correctly.  A well-known example is
the undumping process Emacs goes (went?) through, where it tries to dump the
state of the memory as a new executable, with a lot of pleasure with
various executable formats and subtleties due to side effects in libc
code you don't control.

In explicit state saving each object saves what is needed from its
state to an independently defined format (instead of "whatever the
memory organization happens to be at that point").  When reloading the
state you have to parse it, and it usually requires
rebuilding/relocating all references/pointers/etc.  XEmacs currently
has a "portable dumper" that pretty much does just that.  We don't
have any redumping problems anymore, they're over.
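
As a rough illustration of the explicit approach (a sketch only, not
what XEmacs or any kernel code actually does): each field is written
to a format defined independently of the in-memory layout, here fixed
little-endian 32-bit values, and parsed back on reload.  The structure
and every function name below are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process state; the fields are illustrative only. */
struct saved_task {
	uint32_t pid;
	uint32_t uid;
};

/* Store a 32-bit value in little-endian order, whatever the host is. */
static void put_u32(unsigned char *out, uint32_t v)
{
	out[0] = v & 0xff;
	out[1] = (v >> 8) & 0xff;
	out[2] = (v >> 16) & 0xff;
	out[3] = (v >> 24) & 0xff;
}

/* Read back a 32-bit little-endian value. */
static uint32_t get_u32(const unsigned char *in)
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Explicit save: emit the fields one by one; returns bytes written (8). */
static size_t save_task(const struct saved_task *t, unsigned char *out)
{
	put_u32(out, t->pid);
	put_u32(out + 4, t->uid);
	return 8;
}

/* Explicit load: parse the format back into a fresh structure. */
static void load_task(struct saved_task *t, const unsigned char *in)
{
	t->pid = get_u32(in);
	t->uid = get_u32(in + 4);
}
```

The saved bytes do not depend on padding, endianness or pointer values
of the machine that wrote them, which is the property the explicit
approach is after.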

Which one is the best depends heavily on the application.  The amount
of code in the implicit case depends on the amount of fixups to do.
In the kernel case it happens to be a lot, pretty much everything that
touches hardware has to save to memory the device state and reload it
on resume.  And bugs on hardware handling can be quite annoying to
debug.  And if some driver does not do saving/resume correctly, you
have no way outside of playing with modules to ensure the safety of
the suspend cycle.

The amount of code in the explicit case is an interesting variable in
the case of the kernel.  You have to save what is needed, but how do
you define what is needed?  It is, pretty much, what running processes
can observe from userspace.  Now, what can a process observe:
- its application text and anonymous memory pages
- its file handles
- its mapped files
- its mapped whatever else
- its sys5 IPC stuff
- futex stuff and friends, namespaces, etc
- its intrinsic characteristics it can reach through syscalls
  (i.e. the user-visible parts of current, like pid, uid...)
- its currently running system call, if any

So that's what we'd have to explicitly save.  Anonymous memory, sys5
IPC, futex and current structures, that's easy stuff in practice.  The
fun parts are pretty much:
- references to files
- references to active networking links
- references to devices and associated visible state
- currently running system call, aka the kernel stack for the process

The last one is the one I'm the most afraid of.  I hope that the
signal stuff and/or the asynchronous syscall stuff that was discussed
recently would allow us to "unwind" blocking system calls back to the
syscall level and then store the parameters for resume-time restart.
The non-blocking calls you can just let finish.

The first one is really interesting.  If you value your filesystems,
you'd rather have them clean after the suspend.  And also you pretty
much know that filesystems can move around when you're not looking, be
it USB hotplug stuff (discovery order is random-ish isn't it?), module
loading order issues or multithreaded device discovery.  So you're much
happier *not* caching anything from the filesystem that you can avoid.

But what is a file reference, really?  With the dcache handy, it's
pretty much a path, since inodes don't always exist reliably.  And if
you have the lists of paths used by the processes on a particular
filesystem, you can easily get an idea of where, if anywhere, the
filesystem is even if you don't have reliable serials.  More
interestingly, you cannot, in any case, instantly corrupt your
filesystem by having a mismatch between the in-memory cache and the
reality.

The processes which referenced files you can't find anywhere will
end-up with EBADF or segfault depending on whether it was fd or mmap,
ala revoke().  They'll probably die horribly.  I'd rather have
processes die than filesystems die, since in any case if the file
isn't here anymore in practice the process could only destroy things.

An interesting thing there: nothing in that touches either the
filesystem or the block devices.  Everything is done at the VFS level.
The devices don't need to care.  And the "this filesystem goes there"
can be done in userspace in an initramfs if people want to experiment
with kinky strategies.  After all, why not allow a sysadmin to regroup
two filesystems into one through a suspend, the processes mostly don't
need to care (well, tar may, but heh).  Deleted files would have to be
sillyrenamed or something.  Implementation details ;-)

Active networking links, you can consider

Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)

2007-04-26 Thread Olivier Galibert

On Thu, Apr 26, 2007 at 01:09:53PM +0200, Pavel Machek wrote:
> #define SNAPSHOT_SET_IMAGE_SIZE   _IOW(SNAPSHOT_IOC_MAGIC, 6, unsigned long)

So I'm not supposed to be able to suspend the 16GB-RAM, 32-bit servers
I have here?

  OG.

Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)

2007-04-25 Thread Olivier Galibert

On Wed, Apr 25, 2007 at 11:50:45AM -0700, Linus Torvalds wrote:
> .. but if the alternative is a feature that just isn't worth it, and 
> likely to not only have its own bugs, but cause bugs elsewhere? (And yes, 
> I believe STD is both of those. There's a reason it's called "STD". Go 
> to google and type "STD" and press "I'm feeling lucky". Google is God).

If it was correctly designed, it would be possible to change the
hardware or even the kernel through a STD cycle.  And that would be
damn interesting on servers.

In any case, if I could trust it, I'd use it when I need to move
servers around and I don't want to lose what is running.  Riding power
cuts that way would be nice.

  OG.


Re: [patch 1/7] libata: check for AN support

2007-04-25 Thread Olivier Galibert

On Wed, Apr 25, 2007 at 08:16:51PM +0100, Matt Sealey wrote:
> > +#define ata_id_has_AN(id)  \
> > +   ( (((id)[76] != 0x) && ((id)[76] != 0x)) && \
> > + ((id)[78] & (1 << 5)) )
> 
> ??
> 
> > --- 2.6-git.orig/include/linux/libata.h
> > +++ 2.6-git/include/linux/libata.h
> > @@ -136,6 +136,7 @@ enum {
> > ATA_DFLAG_CDB_INTR  = (1 << 2), /* device asserts INTRQ when ready for CDB */
> > ATA_DFLAG_NCQ   = (1 << 3), /* device supports NCQ */
> > ATA_DFLAG_FLUSH_EXT = (1 << 4), /* do FLUSH_EXT instead of FLUSH */
> > +   ATA_DFLAG_AN= (1 << 5), /* device supports Async notification */
> > ATA_DFLAG_CFG_MASK  = (1 << 8) - 1,
> 
> Why don't the macros use the enums? It makes the code hard to read without
> painful cross-reference doesn't it? Surely (id)[76] & (ATA_DFLAG_AN) is a
> lot more readable than 1 << 5 - even if the flag is obviously that, a lot
> of values and registers can have 1 << 5 as a flag and mean a lot of different
> things.

The two being 32 is just a coincidence.  One is a hardware register
bit, the other is the meaning of the bits of ata_device->flags.
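
A small sketch of why the two must stay separate names.  All
identifiers here are hypothetical, and the software flag is
deliberately put on a different bit than in the patch to show that
nothing depends on the values matching:

```c
#include <stdint.h>

/* Bit 5 of IDENTIFY word 78: fixed by the hardware specification. */
#define ID_WORD78_AN	(1u << 5)

/* Software-side device flag: any free bit would do (hypothetical value). */
#define DEV_FLAG_AN	(1u << 6)

/* Translate the hardware bit into the driver's own flag namespace. */
static unsigned int flags_from_id(const uint16_t *id, unsigned int flags)
{
	if (id[78] & ID_WORD78_AN)
		flags |= DEV_FLAG_AN;
	return flags;
}
```

Testing id[78] against the software flag would only work for as long
as the two constants happened to be equal.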

  OG.


Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)

2007-04-24 Thread Olivier Galibert

On Tue, Apr 24, 2007 at 04:41:58PM -0700, Linus Torvalds wrote:
> How many different magic ioctl's does the thing introduce? Is it really 
> just *two* entry-points (and how simple are they, interface-wise), and 
> nothing else?

Aren't you a little late to the party here?  The userland version is
the one that currently is in the kernel, after all the people who said
"doing it in userland is not necessarily a good idea" got happily
ignored.  Suspend2, which is the continuation of the fully-in-kernel one,
is the one that has been constantly rejected by Pavel, lately by
saying "it should be done in userspace", and hence never merged.

Incidentally, it's 13 ioctls, and it's documented in
Documentation/power/userland-swsusp.txt in a hard drive near you.  I
especially like the "get the available swap space in bytes" one that
can only handle 32 bits.
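
To put numbers on the 32-bit limit, here is a minimal sketch of what
happens to a byte count when it has to travel through a 32-bit value.
The helper name is made up; it models the truncation only, not the
ioctl itself.

```c
#include <stdint.h>

/* Hypothetical helper: a byte count squeezed through a 32-bit quantity. */
static uint32_t bytes_as_u32(uint64_t bytes)
{
	return (uint32_t)bytes;	/* keeps only the low 32 bits */
}
```

With this, 20 GiB (20ULL << 30) comes out as 0 and 5 GiB comes out as
1 GiB; anything at or above 4 GiB is misreported.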

  OG.

Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Olivier Galibert

On Tue, Apr 24, 2007 at 01:53:27PM -0700, Kristen Carlson Accardi wrote:
> Check to see if an ATAPI device supports Asynchronous Notification.
> If so, enable it.
> 
> changes from last version: 
> * fix typo in ata_id_has_AN and make word 76 test more clear
> * If we fail to set the AN feature, just print a warning and continue
>  
> Signed-off-by: Kristen Carlson Accardi <[EMAIL PROTECTED]>
> 
> @@ -299,6 +305,8 @@ struct ata_taskfile {
>  #define ata_id_queue_depth(id)   (((id)[75] & 0x1f) + 1)
>  #define ata_id_removeable(id)((id)[0] & (1 << 7))
>  #define ata_id_has_dword_io(id)  ((id)[50] & (1 << 0))
> +#define ata_id_has_AN(id)\
> + (((id[76] != 0x) && (id[76] != 0x)) && ((id)[78] & (1 << 5)))

(id)[76] I guess ?  Sorry for being a pain :/

  OG.

Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Olivier Galibert

On Tue, Apr 24, 2007 at 08:49:04AM -0700, Kristen Carlson Accardi wrote:
> On Tue, 24 Apr 2007 12:23:04 +0200
> Olivier Galibert <[EMAIL PROTECTED]> wrote:
> 
> > Sorry for replying to Alan's reply, I missed the original mail.
> > 
> > > > +#define ata_id_has_AN(id)  \
> > > > +   ((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))
> > 
> > (a && ~a) & (b & 32)
> > 
> > I don't think that does what you think it does, because at that point
> > it's a funny way to write 0 ((0 or 1) binary-and (0 or 32)).
> > 
> > I'm not even sure what it is you want.  If for the first part you
> > wanted (id[76] != 0x00 && id[76] != 0xff), please write just that,
> > thanks :-)
> > 
> >   OG.
> > 
> 
> From the serial ata spec, we have:
> 
> 13.2.1.18 Word 78: Serial ATA features supported
> If Word 76 is not h or h, Word 78 reports the optional features 
> supported by the device.  Support for this word is optional and if not 
> supported the word shall be zero indicating the device has no support for new 
> Serial ATA capabilities.
> 
> so, basically yes, I'm really testing to make sure that word 76 isn't 0 or all
> ones, then using that value & with the value of the bit in word 78 to determine AN
> support - if you think this is really obfuscated, I've got no problem
> changing it - there's obviously many ways to mess around with bits.

& is not &&, so right now it's really incorrect.  1 & 32 is 0.

((id)[76] != 0x && (id)[76] != 0x && ((id)[78] & (1 << 5)))

The implicit typing of id looks dangerous to me, but you're not the
one who has started it.
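
For what it's worth, a typed variant could look like the sketch below.
The function name is hypothetical, and the 0x0000 / 0xffff constants
are an assumption taken from the "0 or all ones" wording of the spec
excerpt quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical typed replacement for the macro: the parameter type
 * makes it explicit that id is an array of 16-bit IDENTIFY words.
 */
static inline bool id_has_an(const uint16_t *id)
{
	/* Word 78 is only meaningful if word 76 is neither 0 nor all ones. */
	if (id[76] == 0x0000 || id[76] == 0xffff)
		return false;
	return (id[78] & (1 << 5)) != 0;
}
```

Passing anything that is not a pointer to 16-bit words is then
diagnosed by the compiler instead of being silently indexed.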

  OG.

Re: [patch 1/7] libata: check for AN support

2007-04-24 Thread Olivier Galibert

Sorry for replying to Alan's reply, I missed the original mail.

> > +#define ata_id_has_AN(id)  \
> > +   ((id[76] && (~id[76])) & ((id)[78] & (1 << 5)))

(a && ~a) & (b & 32)

I don't think that does what you think it does, because at that point
it's a funny way to write 0 ((0 or 1) binary-and (0 or 32)).

I'm not even sure what it is you want.  If for the first part you
wanted (id[76] != 0x00 && id[76] != 0xff), please write just that,
thanks :-)
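
A minimal sketch that evaluates the expression exactly as posted.  The
function name and the uint16_t array type are assumptions made for
illustration; only the expression itself comes from the patch.

```c
#include <stdint.h>

/* Hypothetical helper wrapping the expression from the patch as posted. */
static int an_as_posted(const uint16_t *id)
{
	/*
	 * (id[76] && ~id[76]) is a logical AND, so it yields 0 or 1.
	 * (id[78] & (1 << 5)) yields 0 or 32.
	 * 1 and 32 share no set bit, so the bitwise AND is always 0.
	 */
	return (id[76] && (~id[76])) & (id[78] & (1 << 5));
}
```

Whatever the device reports in words 76 and 78, this returns 0.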

  OG.

Re: [PATCH] Stop pmac_zilog from abusing 8250's device numbers.

2007-04-05 Thread Olivier Galibert

On Wed, Apr 04, 2007 at 07:15:32PM +0100, Russell King wrote:
> *However* you still run into the issue that you do not know how many
> serial ports you will need to register a tty driver with the tty layer.
> Solve that technical problem and the idea of having a single namespace
> for chosen serial ports and 8250 ports suddenly becomes realistic.

Ok, so that I understand correctly, your problem is with the
tty_register_driver interface as used in
serial_core:uart_register_driver, correct?

Looking at the function, I understand why.
{alloc,register}_chrdev_region is very much not designed to be
fully dynamic, it seems.

  OG.


Re: [PATCH] Stop pmac_zilog from abusing 8250's device numbers.

2007-04-04 Thread Olivier Galibert

On Wed, Apr 04, 2007 at 12:14:34PM +0100, Alan Cox wrote:
> > If you want hierarchy, create it:
> > 
> > /sys/blah/serial/controllerX/portY
> > 
> > and keeping them all under the ttyS? major keeps the simple
> > cases working sanely too.
> 
> Currently yes you could do that, but that would break all the back
> compatibility.

libata's hd->s* does that too, with probably a much larger impact.
Could udev be useful for once and make back compatibility symlinks?
Unifying serial the same way disks, cdroms, network, etc got unified
has some charm.

  OG.


Re: max_loop limit

2007-03-22 Thread Olivier Galibert

On Thu, Mar 22, 2007 at 02:33:14PM +, Al Viro wrote:
> Correction: current ABI is crap.  To set the thing up you need to open
> it and issue an ioctl.  Which is a bloody bad idea, for obvious reasons...

Agreed.  What would be a right way?  Global device ala ptmx/tun/tap?
New syscall?  Something else?

  OG.

Re: [patch 00/11] ANNOUNCE: "Syslets", generic asynchronous system call support

2007-02-13 Thread Olivier Galibert

On Tue, Feb 13, 2007 at 10:57:24PM +0100, Ingo Molnar wrote:
> 
> * Davide Libenzi  wrote:
> 
> > > Open issues:
> 
> > If this is going to be a generic AIO subsystem:
> > 
> > - Cancellation of pending request
> 
> How about implementing aio_cancel() as a NOP. Can anyone prove that the 
> kernel didn't actually attempt to cancel that IO? [but unfortunately 
> failed at doing so, because the platters were being written already.]
> 
> really, what's the point behind aio_cancel()?

Lemme give you a real-world scenario: Question Answering in a Dialog
System.  Your locked-in-memory index ranks documents in a corpus of
several million files according to their chances of containing what
you're looking for.  You have a tenth of a second to read as many of
them as you can, and each seek is 5ms.  So you aio-read them,
requesting them in order of ranking up to 200 or so, and see what you
have at the 0.1s deadline.  If you're lucky, a combination of cache
(especially if you stat() the whole dir tree on a regular basis to
keep the metadata fresh in cache) and of good io reorganisation by the
scheduler will allow you to get a good number of them and do the
information extraction, scoring and clustering of answers, which is
pure CPU at that point.  You *have* to cancel the remaining i/o
because you do not want the disk saturated when the next request
comes, especially if it's 10ms later because the dialog manager found
out it needed a complementary request.
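
In POSIX AIO terms, the pattern could be sketched as below.  This is
an illustration under assumptions, not the code I'm writing for work:
the helper names, the fixed buffer size and the request limit are all
made up, and it uses the userland aio_read/aio_suspend/aio_cancel
calls rather than anything syslet-based.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <time.h>

enum { MAX_REQ = 16, BUF_SZ = 4096 };

/* Milliseconds elapsed since *start on the monotonic clock. */
static long elapsed_ms(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (now.tv_sec - start->tv_sec) * 1000L +
	       (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Hypothetical helper: queue one read per descriptor (best-ranked first),
 * wait at most budget_ms, then cancel whatever is still pending.
 * Returns the number of reads that completed successfully.
 */
static int read_until_deadline(const int *fds, int n, long budget_ms,
			       char bufs[][BUF_SZ])
{
	struct aiocb cbs[MAX_REQ];
	int queued[MAX_REQ];
	struct timespec start;
	int ok = 0;

	if (n > MAX_REQ)
		n = MAX_REQ;
	clock_gettime(CLOCK_MONOTONIC, &start);

	for (int i = 0; i < n; i++) {
		memset(&cbs[i], 0, sizeof(cbs[i]));
		cbs[i].aio_fildes = fds[i];
		cbs[i].aio_buf = bufs[i];
		cbs[i].aio_nbytes = BUF_SZ;
		/* A request that fails to queue is simply left out. */
		queued[i] = (aio_read(&cbs[i]) == 0);
	}

	/* Sleep until everything is done or the budget is spent. */
	for (;;) {
		const struct aiocb *waiting[MAX_REQ];
		long left = budget_ms - elapsed_ms(&start);
		struct timespec ts;
		int busy = 0;

		for (int i = 0; i < n; i++)
			if (queued[i] && aio_error(&cbs[i]) == EINPROGRESS)
				waiting[busy++] = &cbs[i];
		if (busy == 0 || left <= 0)
			break;
		ts.tv_sec = left / 1000;
		ts.tv_nsec = (left % 1000) * 1000000L;
		aio_suspend(waiting, busy, &ts);
	}

	for (int i = 0; i < n; i++) {
		const struct aiocb *one[1] = { &cbs[i] };
		int err;

		if (!queued[i])
			continue;
		if (aio_error(&cbs[i]) == EINPROGRESS &&
		    aio_cancel(fds[i], &cbs[i]) == AIO_NOTCANCELED) {
			/* Already being serviced: the control block and
			 * buffer must stay valid, so wait this one out. */
			while (aio_error(&cbs[i]) == EINPROGRESS)
				aio_suspend(one, 1, NULL);
		}
		err = aio_error(&cbs[i]);
		if (aio_return(&cbs[i]) >= 0 && err == 0)
			ok++;
	}
	return ok;
}
```

The last loop is where a no-op aio_cancel() would hurt: every request
that cannot be cancelled keeps the disk busy past the deadline, and
the caller has to wait for it before reusing the buffers.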

Incidentally, that's something I'm currently implementing for work,
making these aio discussions more interesting than usual :-)

  OG.

Re: somebody dropped a (warning) bomb

2007-02-13 Thread Olivier Galibert

On Tue, Feb 13, 2007 at 09:06:24PM +0300, Sergei Organov wrote:
> I agree that making strxxx() family special is not a good idea. So what
> do we do for a random foo(char*) called with an 'unsigned char*'
> argument? Silence? Hmmm... It's not immediately obvious that it's indeed
> harmless. Yet another -Wxxx option to GCC to silence this particular
> case?

Silence would be good.  "char *" has a special status in C, it can be:
- pointer to a char/to an array of chars (standard interpretation)
- pointer to a string
- generic pointer to memory you can read(/write)

Check the aliasing rules if you don't believe me on the third one.
And it's *way* more often cases 2 and 3 than 1 for the simple reason
that the signedness of char is unpredictable.  As a result, a
signedness warning between char * and (un)signed char * is 99.99% of
the time stupid.
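
A short sketch of case 3, with a made-up helper name: the bytes of any
object can be inspected through a character-type pointer, which the
aliasing rules explicitly allow.

```c
#include <stddef.h>

/* Sum the bytes of any object by viewing it through unsigned char *. */
static unsigned long byte_sum(const void *obj, size_t len)
{
	const unsigned char *p = obj;	/* character types may alias anything */
	unsigned long sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += p[i];
	return sum;
}
```

The same function works on an int, a struct or a string, and the
caller never needs to care whether plain char is signed on the target.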

> May I suggest another definition for a warning being entirely sucks?
> "The warning is entirely sucks if and only if it never has true
> positives." In all other cases it's only more or less sucks, IMHO.

That means a warning that triggers on every line saying "there may be
a bug there" does not entirely suck?

> I'm afraid I don't follow. Do we have a way to say "I want an int of
> indeterminate sign" in C?

Almost completely.  The rules on aliasing say you can convert pointer
between signed and unsigned variants and the accesses will be
unsurprising.  The only problem is that the implicit conversion of
incompatible pointer parameters to a function looks impossible in the
draft I have.  Probably has been corrected in the final version.

In any case, having for instance unsigned int * in a prototype really
means in the language "I want a pointer to integers, and I'm probably
going to use them as unsigned, so beware".  For the special case of
char, since the beware version would require a signed or unsigned tag,
it really means indeterminate.

C is sometimes called a high-level assembler for a reason :-)

> The same way there doesn't seem to be a way
> to say "I want a char of indeterminate sign". :( So no, strlen() doesn't
> actually say that, no matter if we like it or not. It actually says "I
> want a char with implementation-defined sign".

In this day and age it means "I want a 0-terminated string".
Everything else is explicitly signed char * or unsigned char *, often
through typedefs in the signed case.

> In fact it's implementation-defined, and this may make a difference
> here. strlen(), being part of C library, could be specifically
> implemented for given architecture, and as architecture is free to
> define the sign of "char", strlen() could in theory rely on particular
> sign of "char" as defined for given architecture. [Not that I think that
> any strlen() implementation actually depends on sign.]

That would require pointers tagged in one way or another, you can't
distinguish between pointers to differently-signed versions of the
same integer type otherwise (they're required to have same size and
alignment).  You don't have that on modern architectures.

> Can we assure that no function taking 'char*' ever cares about the sign?
> I'm not sure, and I'm not a language lawyer, but if it's indeed the
> case, I'd probably agree that it might be a good idea for GCC to extend
> the C language so that function argument declared "char*" means either
> of "char*", "signed char*", or "unsigned char*" even though there is no
> precedent in the language.

It's a warning you're talking about.  That means it is _legal_ in the
language (even if maybe implementation defined, but legal still).
Otherwise it would be an error.

> BTW, the next logical step would be for "int*" argument to stop meaning
> "signed int*" and become any of "int*", "signed int*" or "unsigned
> int*". Isn't it cool to be able to declare a function that won't produce
> warning no matter what int is passed to it? ;)

No, it wouldn't be logical, because char is *special*.

> Yes, indeed. So the real problem of the C language is inconsistency
> between strxxx() and isxxx() families of functions? If so, what is 
> wrong with actually fixing the problem, say, by using wrappers over
> isxxx()? Checking... The kernel already uses isxxx() that are macros
> that do conversion to "unsigned char" themselves, and a few invocations
> of isspace() I've checked pass "char" as argument. So that's not a real
> problem for the kernel, right?

Because a cast to silence a warning silences every possible warning
even if the then-pointer turns for instance into an integer through an
unrelated change.  Think for instance about an error_t going from
const char * (error string) to int (error code) through a patch, which
happened to be passed to a utf8_to_whatever conversion function that
takes a const unsigned char * as a parameter.  Casting would hide the
impact of changing the type.
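
A sketch of the alternative to a bare cast, under the assumption that
the call sites can go through a small typed helper.  utf8_count(),
as_uchar() and count_error_text() are hypothetical names, not kernel
functions; the point is only that the helper still type-checks its
argument.

```c
#include <stddef.h>

/* Hypothetical conversion-style routine taking unsigned bytes: counts
 * the bytes that start a UTF-8 code point. */
size_t utf8_count(const unsigned char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if ((*s & 0xC0) != 0x80)   /* skip UTF-8 continuation bytes */
            n++;
    return n;
}

/* Typed helper: accepts only const char *.  If the argument's type later
 * becomes int, this call draws a diagnostic, whereas a bare
 * (const unsigned char *) cast at the call site compiles silently. */
static inline const unsigned char *as_uchar(const char *s)
{
    return (const unsigned char *)s;
}

size_t count_error_text(const char *err)
{
    return utf8_count(as_uchar(err));
}
```

The cast still exists, but it lives in one place and only ever converts
from the one type it was written for.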

> As the isxxx() family does not seem to be a real problem, at least in
> the context of the kernel source base, I'd like to l

Re: [PATCH 1/2] Re: [autofs] Bad race condition in the new autofs protocol somewhere

2007-02-13 Thread Olivier Galibert

On Wed, Feb 14, 2007 at 02:35:15AM +0900, Ian Kent wrote:
> On Tue, 2007-02-13 at 16:54 +0100, Olivier Galibert wrote:
> > Don't they require autofs5 to be of any use though?  That's not going
> > to be in fc until it's out of beta I guess.
> 
> Not really?
> 
> [EMAIL PROTECTED] ~]$ cat /etc/redhat-release 
> Fedora Core release 6 (Zod)
> [EMAIL PROTECTED] ~]$ rpm -q autofs
> autofs-5.0.1-0.rc3.16

Oh cool, my guess was wrong.


> The Rawhide or FC-6 srpm should build fine on FC5.

The .spec that comes with the tgz works without a hitch too.  I guess
fc6 is going to have a *really* working automounter RSN then.

  OG.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/2] Re: [autofs] Bad race condition in the new autofs protocol somewhere

2007-02-13 Thread Olivier Galibert

On Tue, Feb 13, 2007 at 09:07:27AM -0500, Chuck Ebbert wrote:
> Olivier Galibert wrote:
> > On Tue, Feb 13, 2007 at 09:52:39AM +0900, Ian Kent wrote:
> >> Indeed.
> >> Which kernel can you use?
> >> I believe that 2200 had another problem so can you use an fc5 kernel
> >> later than that?
> > 
> > I've ported your patch to 2257 (nothing special, only moved lines),
> > and it seems to work beautifully.  I'm enlarging the testing.
> > 
> 
> If you get the patches into -stable they will end up in Fedora
> kernels automatically. 2288 (based on 2.6.19) is in testing now...

Don't they require autofs5 to be of any use though?  That's not going
to be in fc until it's out of beta I guess.

  OG.


Re: [PATCH 1/2] Re: [autofs] Bad race condition in the new autofs protocol somewhere

2007-02-13 Thread Olivier Galibert

On Tue, Feb 13, 2007 at 09:52:39AM +0900, Ian Kent wrote:
> Indeed.
> Which kernel can you use?
> I believe that 2200 had another problem so can you use an fc5 kernel
> later than that?

I've ported your patch to 2257 (nothing special, only moved lines),
and it seems to work beautifully.  I'm enlarging the testing.

  OG.


Re: [PATCH 1/2] Re: [autofs] Bad race condition in the new autofs protocol somewhere

2007-02-12 Thread Olivier Galibert

On Mon, Feb 12, 2007 at 03:43:14PM +0900, Ian Kent wrote:
> On Thu, 2007-02-08 at 11:33 +0900, Ian Kent wrote:
> > On Wed, 2007-02-07 at 19:18 +0100, Olivier Galibert wrote:
> > > On Thu, Feb 08, 2007 at 03:07:41AM +0900, Ian Kent wrote:
> > > > It may be better to update to a later kernel so I don't have to port the
> > > > patch to several different kernels. Is that possible?
> > > 
> > > Sure, 2.6.20 or -git?
> > 
> > 2.6.20 has all the patches I've proposed so far except for the one we're
> > working on so that would be best for me.
> > 
> > Seems there may still be a problem with the patch so I'll let you know
> > what's happening as soon as I can.
> 
> I think I'm just about done.
> 
> Could you try using the two patches here against 2.6.20 please:

The patch works beautifully, no more failures with my test rig, until
the point where the kernel crashes.  Since the crashes happen without
the patch too, you're off the hook, but that means I can't deploy it
for harsher testing yet.

No wonder Dave Jones is prudent about updating kernels in fc :-)

  OG.

Re: [autofs] Bad race condition in the new autofs protocol somewhere

2007-02-07 Thread Olivier Galibert

On Thu, Feb 08, 2007 at 03:07:41AM +0900, Ian Kent wrote:
> It may be better to update to a later kernel so I don't have to port the
> patch to several different kernels. Is that possible?

Sure, 2.6.20 or -git?

  OG.

Re: What does this scsi error mean ?

2007-02-07 Thread Olivier Galibert

On Thu, Jan 18, 2007 at 03:08:46PM +0100, Olivier Galibert wrote:
> On Mon, Jan 15, 2007 at 11:14:52PM +0000, Alan wrote:
> > > Both smart and the internal blade diagnostics say "everything is a-ok
> > > with the drive, there hasn't been any error ever except a bunch of
> > > corrected ECC ones, and no more than with a similar drive in another
> > > working blade".  Hence my initial post.  "Hardware error" is kinda
> > > imprecise, so I was wondering whether it was unexpected controller
> > > answer, detected transmission error, block write error, sector not
> > > found...  Is there a way to have more information?
> > 
> > Well the right place to look would indeed have been the SMART data
> > providing the drive didn't get into a state it couldn't update it.
> > Hardware error comes from the drive deciding something is wrong (or a
> > raid card faking it I guess). That covers everything from power
> > fluctuations and overheating through firmware consistency failures and
> > more.
> > 
> > If you pull the drive and test it in another box does it show the same ?
> 
> Ok, inverted the disks, got a crash of the same blade with the new
> disk, so the problem is not the drive itself.  Gonna try inverting two
> blades to check if it's the power supply connector/rail.

...and it is the power supply/connector.  Failure is linked to the
position of the blade in the box (as in the blade in the first
position always fails).  Now that's a cute failure.  Having the
support act on it is going to be fun.

  OG.

PS: Yes, I did forget to send that email :-)

Bad race condition in the new autofs protocol somewhere

2007-02-07 Thread Olivier Galibert

The setup:
  /people is a NIS automount.  /people/gadda points to m179:/disk05/disk11/gadda
  /hosts is a two-level automount, /hosts/xx/yy points to xx:/yy using:

in auto.master:
  /hosts file:/etc/auto.hosts

in /etc/auto.hosts:
  * -fstype=autofs,-Dhost=& file=/etc/auto.hosts.sub

in /etc/auto.hosts.sub:
  * ${host}:/&

/people/gadda/normalisation is a symlink to /hosts/m179/disk03/gadda/lemonde

I have a small test program:

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
  for(;;) {
    struct timeval tv1, tv2;
    struct timespec ts;
    gettimeofday(&tv1, 0);
    int fd = open("/people/gadda/normalisation/tempo4.general", O_RDONLY);
    gettimeofday(&tv2, 0);
    if(fd < 0)
      printf("%d.%06d - %d.%06d failure\n", (int)tv1.tv_sec, (int)tv1.tv_usec,
             (int)tv2.tv_sec, (int)tv2.tv_usec);
    else
      close(fd);

    ts.tv_sec = 0;
    ts.tv_nsec = lrand48() % 100000;
    nanosleep(&ts, 0);
  }
}

I.e. try opening an existing file "through" the symlink and the
automounts, give the time before and after the open syscall if it
fails.  Wait for a random time no more than 0.1s and try again.

Actually the random time is no more than 0.1ms, oops.  Whatever.

Then I start the automounter with a 1 second timeout, and I get random
failures.

1170867758.212109 - 1170867758.222789 failure
1170867758.668086 - 1170867758.678607 failure
1170867760.007998 - 1170867760.057324 failure
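
For reading these lines, a small helper (a sketch, tv_diff_us is a
made-up name) that turns the two gettimeofday() samples into the
duration of the failing open():

```c
#include <sys/time.h>

/* Elapsed microseconds between two gettimeofday() samples. */
long long tv_diff_us(struct timeval start, struct timeval end)
{
    return (long long)(end.tv_sec - start.tv_sec) * 1000000LL
         + (long long)(end.tv_usec - start.tv_usec);
}
```

Applied to the three failures above it gives 10680, 10521 and 49326
microseconds, i.e. each failing open() was stuck for roughly 10 to 50 ms
before returning an error.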

The log from an extra-verbose, but otherwise unchanged, git-of-today
version of autofs is:

Feb  7 18:02:24 m78 automount[16532]: Starting automounter version 5.0.1-rc3, master map auto.master
Feb  7 18:02:24 m78 automount[16532]: using kernel protocol version 5.00
Feb  7 18:02:24 m78 automount[16532]: mounted indirect mount on /hosts with timeout 1, freq 1 seconds
Feb  7 18:02:24 m78 automount[16532]: mounted indirect mount on /people with timeout 1, freq 1 seconds
Feb  7 18:02:24 m78 automount[16532]: mounted indirect mount on /corpora with timeout 1, freq 1 seconds
Feb  7 18:02:24 m78 automount[16532]: mounted indirect mount on /w3 with timeout 1, freq 1 seconds
Feb  7 18:02:24 m78 automount[16532]: mounted indirect mount on /vol with timeout 1, freq 1 seconds
Feb  7 18:02:33 m78 automount[16532]: 1170867753.258726 got packet type 3
Feb  7 18:02:33 m78 automount[16532]: attempting to mount entry /people/gadda
Feb  7 18:02:33 m78 automount[16532]: mount(nfs): mounted m179:/disk05/disk11/gadda on /people/gadda
Feb  7 18:02:33 m78 automount[16532]: mounted /people/gadda
Feb  7 18:02:33 m78 automount[16532]: 1170867753.355598 got packet type 3
Feb  7 18:02:33 m78 automount[16532]: attempting to mount entry /hosts/m179
Feb  7 18:02:33 m78 automount[16532]: mounted indirect mount on /hosts/m179 with timeout 1, freq 1 seconds
Feb  7 18:02:33 m78 automount[16532]: mounted /hosts/m179
Feb  7 18:02:33 m78 automount[16532]: 1170867753.456997 got packet type 3
Feb  7 18:02:33 m78 automount[16532]: attempting to mount entry /hosts/m179/disk03
Feb  7 18:02:33 m78 automount[16532]: mount(nfs): mounted m179:/disk03 on /hosts/m179/disk03
Feb  7 18:02:33 m78 automount[16532]: mounted /hosts/m179/disk03
Feb  7 18:02:35 m78 automount[16532]: mount still busy /people
Feb  7 18:02:35 m78 automount[16532]: mount still busy /hosts/m179
Feb  7 18:02:36 m78 automount[16532]: mount still busy /hosts
Feb  7 18:02:37 m78 automount[16532]: mount still busy /people
Feb  7 18:02:38 m78 automount[16532]: 1170867758.212137 got packet type 4
Feb  7 18:02:38 m78 automount[16532]: 1170867758.212282 expiring path /people/gadda
Feb  7 18:02:38 m78 automount[16532]: 1170867758.212983 unmounting dir = /people/gadda
Feb  7 18:02:38 m78 automount[16532]: 1170867758.222742 expired /people/gadda
Feb  7 18:02:38 m78 automount[16532]: 1170867758.224147 got packet type 3
Feb  7 18:02:38 m78 automount[16532]: attempting to mount entry /people/gadda
Feb  7 18:02:38 m78 automount[16532]: mount(nfs): mounted m179:/disk05/disk11/gadda on /people/gadda
Feb  7 18:02:38 m78 automount[16532]: mounted /people/gadda
Feb  7 18:02:38 m78 automount[16532]: 1170867758.668221 got packet type 4
Feb  7 18:02:38 m78 automount[16532]: 1170867758.668379 expiring path /hosts/m179/disk03
Feb  7 18:02:38 m78 automount[16532]: 1170867758.669053 unmounting dir = /hosts/m179/disk03
Feb  7 18:02:38 m78 automount[16532]: 1170867758.678567 expired /hosts/m179/disk03
Feb  7 18:02:38 m78 automount[16532]: 1170867758.680159 got packet type 3
Feb  7 18:02:38 m78 automount[16532]: attempting to mount entry /hosts/m179/disk03
Feb  7 18:02:38 m78 automount[16532]: mount(nfs): mounted m179:/disk03 on /hosts/m179/disk03
Feb  7 18:02:38 m78 automount[16532]: mounted /hosts/m179/disk03
Feb  7 18:02:38 m78 automount[16532]: mount still busy /hosts/m179
Feb  7 18:02:39 m78 automount[16532]: mount still busy /people
Feb  7 18:02:39 m78 automount[16532]: mount still busy /hosts
Feb  7 18:02:40 m78 automount[16532]: 1170867760.004392 got pa

Re: 2.6.20-rc6-mm3

2007-01-30 Thread Olivier Galibert

On Tue, Jan 30, 2007 at 01:26:31AM -0800, Andrew Morton wrote:
> Len, what was in that merge anyway?  Lots of renaming and shuffling things
> around - the sorts of things which are safe as long as they compile OK.  But
> was there much substantive material in there as well?

It seems heavy in general, but the intersection with mmconfig looks
rather limited:
- s/acpi_table_mcfg_config/acpi_mcfg_allocation/
- s/base_address/address/
- s/pci_segment_group_number/pci_segment/
- address is now 64 bits

The last point is both good and bad.  The i965 needs it (good), I
don't know if the mapping functions can handle actual 64bits addresses
(maybe bad), especially on i386.
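
The "maybe bad" case could be guarded with a check along these lines.
This is only a sketch: mmcfg_fits_below_4g is a hypothetical helper,
not something in the patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if [address, address + len) can be described with 32-bit
 * physical addresses, i.e. it ends at or below the 4 GiB boundary. */
bool mmcfg_fits_below_4g(uint64_t address, uint64_t len)
{
    const uint64_t four_gib = 1ULL << 32;

    return address < four_gib && len <= four_gib - address;
}
```

A mapping path that can't cope with real 64-bit addresses would then
refuse the entry instead of silently truncating the address.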

  OG.


Re: 2.6.20-rc6-mm3

2007-01-30 Thread Olivier Galibert

On Mon, Jan 29, 2007 at 08:45:28PM -0800, Andrew Morton wrote:
> -x86_64-mm-share-whats-shareable.patch
> -x86_64-mm-only-call-unreachable_devices-when-type-1-is-available.patch
> -x86_64-mm-only-map-whats-necessary.patch
> -x86_64-mm-detect-and-support-the-e7520-and-the-945g-gz-p-pl.patch
> -x86_64-mm-reserve-resources-but-only-when-were-sure-about-them.patch

Want me to update these?  And maybe the other mmconfig related ones if
I can find them.

  OG.

Re: What does this scsi error mean ?

2007-01-18 Thread Olivier Galibert

On Mon, Jan 15, 2007 at 11:14:52PM +0000, Alan wrote:
> > Both smart and the internal blade diagnostics say "everything is a-ok
> > with the drive, there hasn't been any error ever except a bunch of
> > corrected ECC ones, and no more than with a similar drive in another
> > working blade".  Hence my initial post.  "Hardware error" is kinda
> > imprecise, so I was wondering whether it was unexpected controller
> > answer, detected transmission error, block write error, sector not
> > found...  Is there a way to have more information?
> 
> Well the right place to look would indeed have been the SMART data
> providing the drive didn't get into a state it couldn't update it.
> Hardware error comes from the drive deciding something is wrong (or a
> raid card faking it I guess). That covers everything from power
> fluctuations and overheating through firmware consistency failures and
> more.
> 
> If you pull the drive and test it in another box does it show the same ?

Ok, inverted the disks, got a crash of the same blade with the new
disk, so the problem is not the drive itself.  Gonna try inverting two
blades to check if it's the power supply connector/rail.

  OG.


Re: [PATCH] update MMConfig patches w/915 support

2007-01-16 Thread Olivier Galibert

On Mon, Jan 08, 2007 at 01:27:15PM -0800, Jesse Barnes wrote:
> On Monday, January 8, 2007 12:45 pm, Olivier Galibert wrote:
> > On Sun, Jan 07, 2007 at 11:44:16AM -0800, Jesse Barnes wrote:
> > > For reference, here's the probe routine I tried for 965, probably
> > > something dumb wrong with it that I'm not seeing atm.
> >
> > It shouldn't have mattered in your case, but base_address is limited
> > to 32bits.  There is a 32 bits reserved zone after it so hope is not
> > to be lost, but in any case the current code can't handle over-4G
> > base addresses at that point.
> >
> > Does the bios or your '965 give a correct acpi mmconfig entry?
> 
> I should have captured the debug printk I put in while testing.  I think 
> the base address specified in the register was 0xf0000000, with a size 
> of 256M marked enabled.  Arjan points out that this likely conflicts 
> with other BIOS mappings, which is probably why I saw my machine hang 
> when I tried to use it.
> 
> As for ACPI, I assume you mean the MCFG table?  I haven't looked at it, 
> but the stock kernel complains about a lack of the MCFG range in the 
> e820 table and subsequently disables mmconfig.
> 
> But won't the bridge register value control what actually gets decoded?  
> If so, it sounds like this BIOS is buggy wrt mmconfig mapping in 
> general; good thing I'm not using any PCIe devices I guess...

Yeah.  I've checked the docs, I think I know what's going on.  On one
hand, if the chipset is configured to have the range somewhere, it is
decoded before anything external to the chipset, be it ram or mmaped
i/o.  So the information you get from the chipset should not be able
to conflict with anything by definition, it's the anything that
wouldn't be visible.

But in your case of a mapping starting at f0000000, something else
interesting is going on: it's conflicting with other internal
registers of the chipset, which, being fixed address, probably have
priority.  So you probably have to either reduce the range so that the
chipset registers aren't touched, or drop mmconfig if the address is
f0000000.
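
A sketch of the "reduce the range" option, relying only on the fact
that each bus takes 1 MiB of mmconfig space.  mmcfg_usable_buses and
reserved_start are hypothetical; where the fixed chipset registers
actually begin has to come from the chipset documentation.

```c
#include <stdint.h>

#define MMCFG_BYTES_PER_BUS (1u << 20)  /* 32 devices * 8 functions * 4 KiB */

/* Number of whole buses that fit in [base, base + len) without reaching
 * reserved_start (hypothetical: first fixed chipset address to avoid).
 * Returns 0 when nothing usable is left, i.e. mmconfig should be dropped. */
unsigned int mmcfg_usable_buses(uint64_t base, uint64_t len,
                                uint64_t reserved_start)
{
    uint64_t end = base + len;

    if (reserved_start <= base)
        return 0;
    if (end > reserved_start)
        end = reserved_start;          /* shrink the window */
    return (unsigned int)((end - base) / MMCFG_BYTES_PER_BUS);
}
```

With a 256M window at 0xf0000000 and, purely as an example value,
0xfec00000 as the first address to avoid, 236 buses would remain
reachable.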

Technically, we can have the exact same problem with the other
chipsets.  BIOSen suck.

  OG.


Re: [PATCH -mm] MMCONFIG: Reject a broken MCFG tables on Asus etc

2007-01-16 Thread Olivier Galibert

On Sun, Jan 14, 2007 at 06:27:18AM +0900, OGAWA Hirofumi wrote:
> This rejects a broken MCFG tables on Asus etc.
> Arjan and Andi suggest this.

And I agree completely with the principle.  If you don't know the
chipset on a first-name basis, trash the MCFG unless it's squeaky
clean (or you don't have a choice).

> +static void __init pci_mmcfg_reject_broken(void)
> +{
> + struct acpi_table_mcfg_config *cfg = &pci_mmcfg_config[0];
> +
> + /*
> +  * Handle more broken MCFG tables on Asus etc.
> +  * They only contain a single entry for bus 0-0.
> +  */
> + if (pci_mmcfg_config_num == 1 &&
> + cfg->pci_segment_group_number == 0 &&
> + (cfg->start_bus_number | cfg->end_bus_number) == 0) {
> + kfree(pci_mmcfg_config);
> + pci_mmcfg_config = NULL;
> + pci_mmcfg_config_num = 0;
> +
> + printk(KERN_ERR "PCI: start and end of bus number is 0. "
> +"Rejected as broken MCFG.");
> + }
> +}
> +

If you're going to do a MCFG validation function, and I don't have a
problem with that, you should put the e820 test in it too.
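
Roughly what I have in mind, as a userspace model rather than kernel
code.  The field names follow the quoted patch; struct reserved_range,
range_is_reserved() and mcfg_is_sane() are made up, and deriving the
window size from the bus range is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the MCFG entry; field names as in the patch. */
struct mcfg_entry {
    uint32_t base_address;
    uint16_t pci_segment_group_number;
    uint8_t  start_bus_number;
    uint8_t  end_bus_number;
};

struct reserved_range {       /* one "reserved" region from the e820 map */
    uint64_t start;
    uint64_t end;             /* exclusive */
};

/* True if [start, end) lies entirely inside one reserved region. */
static bool range_is_reserved(const struct reserved_range *map, size_t n,
                              uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < n; i++)
        if (start >= map[i].start && end <= map[i].end)
            return true;
    return false;
}

/* Single validation point: both rejection rules live in one function. */
bool mcfg_is_sane(const struct mcfg_entry *cfg, size_t num,
                  const struct reserved_range *map, size_t n)
{
    if (num == 0)
        return false;
    /* Rule from the patch: a lone entry covering only bus 0 is broken. */
    if (num == 1 && cfg[0].pci_segment_group_number == 0 &&
        (cfg[0].start_bus_number | cfg[0].end_bus_number) == 0)
        return false;
    /* e820 rule: the first entry's window must sit in reserved memory. */
    uint64_t base = cfg[0].base_address;
    uint64_t size = ((uint64_t)cfg[0].end_bus_number -
                     cfg[0].start_bus_number + 1) << 20;
    return range_is_reserved(map, n, base, base + size);
}
```

Keeping both tests together means there is one place to look at when a
new kind of broken table shows up.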

  OG.

Re: What does this scsi error mean ?

2007-01-16 Thread Olivier Galibert

On Tue, Jan 16, 2007 at 03:47:52PM +0000, Alan wrote:
> The drives do that automatically, and the SCSI verify did it for him too
> if there were any other problems.

The SCSI verify didn't see a thing, I'm gonna do the disk swapping
dance.

  OG.


Re: What does this scsi error mean ?

2007-01-15 Thread Olivier Galibert

On Mon, Jan 15, 2007 at 11:14:52PM +0000, Alan wrote:
> If you pull the drive and test it in another box does it show the same ?

I'm going to try that.  The problem requires 3-7 days to appear, so I
won't know immediately.

> And what does a scsi verify have to say ?

Running, looks like it's gonna take a little while.

  OG.

Re: What does this scsi error mean ?

2007-01-15 Thread Olivier Galibert

On Tue, Jan 16, 2007 at 12:27:17AM +0100, Stefan Richter wrote:
> On 15 Jan, Olivier Galibert wrote:
> > sd 0:0:0:0: SCSI error: return code = 0x08000002
> > sda: Current: sense key: Hardware Error
> > ASC=0x42 ASCQ=0x0
> 
> The Additional Sense Code means "power-on or self-test failure" FWIW.
> (SPC-4 annex D)

Given that it happens between 3 days and a week after bootup on the root
drive, it's obviously not the "power on" part.  It's kinda annoying
that nothing appears in the smart logs though:

smartctl version 5.36 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: IBM-ESXS ST936701LCFN Version: B41D
Serial number: 3LC0C8P07647WLMV
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Tue Jan 16 00:33:09 2007 CET
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK

Current Drive Temperature:     33 C
Drive Trip Temperature:        60 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 16206797
  Blocks received from initiator = 83607272
  Blocks read from cache and sent to initiator = 3311410
  Number of read and write commands whose size <= segment size = 2801896
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 533.07
  number of minutes until next internal SMART test = 112

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:      10474        0         0     10474      10474         61.360           0
write:         0        0         0         0          0         58.647           2

Non-medium error count:  1457822

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -     407                 - [-   -    -]
# 2  Background short  Completed                   -     243                 - [-   -    -]

Long (extended) Self Test duration: 793 seconds [13.2 minutes]


  OG.


Re: What does this scsi error mean ?

2007-01-15 Thread Olivier Galibert

On Mon, Jan 15, 2007 at 06:45:40PM +0000, Alan wrote:
> On Mon, 15 Jan 2007 18:16:02 +0100
> Olivier Galibert <[EMAIL PROTECTED]> wrote:
> 
> > sd 0:0:0:0: SCSI error: return code = 0x08000002
> > sda: Current: sense key: Hardware Error
> > ASC=0x42 ASCQ=0x0
> 
> I'll give you a clue: The words "Hardware Error".
> 
> Run a SCSI verify pass on the drive with some drive utilities and see
> what happens. If you are lucky it'll just reallocate blocks and decide
> the drive is ok, if not well see what the smart data thinks.

Both smart and the internal blade diagnostics say "everything is a-ok
with the drive, there hasn't been any error ever except a bunch of
corrected ECC ones, and no more than with a similar drive in another
working blade".  Hence my initial post.  "Hardware error" is kinda
imprecise, so I was wondering whether it was unexpected controller
answer, detected transmission error, block write error, sector not
found...  Is there a way to have more information?
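
For what the return code itself can tell, here is a sketch that splits
it into its four bytes.  It assumes the code is the full 32-bit result
word 0x08000002; the struct and function names are made up, the byte
meanings in the comments are the usual Linux/SAM ones.

```c
#include <stdint.h>

/* Byte layout of the 32-bit SCSI result word:
 * driver byte | host byte | message byte | status byte. */
struct scsi_result_parts {
    uint8_t driver;   /* 0x08 = DRIVER_SENSE: sense data is valid       */
    uint8_t host;     /* 0x00 = DID_OK: no host adapter/transport error */
    uint8_t msg;
    uint8_t status;   /* 0x02 = CHECK CONDITION                         */
};

struct scsi_result_parts scsi_split_result(uint32_t result)
{
    struct scsi_result_parts p = {
        .driver = (result >> 24) & 0xff,
        .host   = (result >> 16) & 0xff,
        .msg    = (result >> 8) & 0xff,
        .status = result & 0xff,
    };
    return p;
}
```

Under that assumption the host byte is zero, so the command reached the
drive and the drive itself answered with CHECK CONDITION; anything more
precise has to come from the sense data (sense key, ASC, ASCQ).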

  OG.


What does this scsi error mean ?

2007-01-15 Thread Olivier Galibert

sd 0:0:0:0: SCSI error: return code = 0x08000002
sda: Current: sense key: Hardware Error
ASC=0x42 ASCQ=0x0
Info fld=0x400802c
end_request: I/O error, dev sda, sector 202369
Aborting journal on device sda1.
journal commit I/O error
ext3_abort called.
EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only


It's always on a journal write and smart on the disk doesn't see a
thing (no error log, short and long smart tests pass).

In case it is relevant (it's an IBM LS20 blade):
00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07)
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
00:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
01:00.1 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
01:04.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE]
02:01.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet (rev 10)
02:01.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet (rev 10)
02:02.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)

ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM-ESXS Model: ST936701LCFN Rev: B41D
  Type:   Direct-Access                    ANSI SCSI revision: 04

  OG.

Re: [PATCH] PCI mmconfig support for Intel 915 bridges

2007-01-11 Thread Olivier Galibert

On Wed, Jan 10, 2007 at 06:53:03PM -0800, Jesse Barnes wrote:
> This is a resend of the patch I sent earlier to Oliver.  It adds support
> for Intel 915 bridge chips to the new PCI MMConfig detection code.  Tested
> and works on my sole 915 based platform (a Toshiba laptop).  I added
> register masking per Oliver's suggestion, and moved the __init qualifier to
> after the 'static const char' to match Ogawa-san's recent cleanup patches.
> 
> Over time we can probably associate more PCI IDs with this routine, since
> i915 family contains a few other chips.  But since I didn't have platforms
> to test such additions on, they're left out for now.
> 
> Signed-off-by:  Jesse Barnes <[EMAIL PROTECTED]>

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>

Andrew, you sent me a series of emails to tell me the patches had
moved to another subsystem tree, can you tell me which one?  There is
at least an anti-regression patch to add on top (the Asus/Nvidia
special case is causing oopses right now).

  OG.


Re: [PATCH] update MMConfig patches w/915 support

2007-01-08 Thread Olivier Galibert

On Sun, Jan 07, 2007 at 11:44:16AM -0800, Jesse Barnes wrote:
> For reference, here's the probe routine I tried for 965, probably something 
> dumb wrong with it that I'm not seeing atm.

It shouldn't have mattered in your case, but base_address is limited
to 32bits.  There is a 32 bits reserved zone after it so hope is not
to be lost, but in any case the current code can't handle over-4G base
addresses at that point.

Does the bios or your '965 give a correct acpi mmconfig entry?

  OG.

> P.S.  Hooray for Intel for publishing their bridge specs!  Makes stuff like 
> this a bit easier.

Ohhh yes.

Re: [PATCH] update MMConfig patches w/915 support

2007-01-08 Thread Olivier Galibert

On Sun, Jan 07, 2007 at 11:42:09AM -0800, Jesse Barnes wrote:
> This patch updates Oliver's MMConfig bridge detection patches with support
> for 915G bridges.  It seems to work ok on my 915GM laptop.

Looks ok to me.


> I also tried adding 965 support, but it doesn't work (at least not on my
> G965 box).  When I enable MMConfig support when the register value is
> 0xf0000003 (should be a 256M enabled window at 0xf0000000) the box hangs
> at boot, so I'm not sure what I'm doing wrong...
> 
> The routines could probably be consolidated into a single probe_intel_9xx
> routine or something, but I haven't really looked at that yet (though there
> are many similarities between the 91[05], 945 and 965 families, they may not
> be enough that the code would actually be simpler if shared).

The individual functions are so simple, it's probably way better for
maintenance simplicity to keep them separate, at least for now.


> + pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
> +
> + /* No enable bit or size field, so assume 256M range is enabled. */
> + len = 0x10000000U;
> + pci_mmcfg_config_num = 1;
> +
> + pci_mmcfg_config = kzalloc(sizeof(pci_mmcfg_config[0]), GFP_KERNEL);
> + pci_mmcfg_config[0].base_address = pciexbar;

Hmmm, I'd mask out the reserved bits if I were you.  Paranoia :-)
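
Something like this, as a sketch.  The mask assumes the top four bits
of the register select a 256M-aligned base and everything below is
reserved; check it against the 915 datasheet before using it.
pciexbar_base is a made-up helper name.

```c
#include <stdint.h>

/* Assumed layout for illustration: the top four bits of the 32-bit
 * PCIEXBAR value select a 256 MiB-aligned base; the rest is reserved. */
#define PCIEXBAR_BASE_MASK 0xf0000000u

uint32_t pciexbar_base(uint32_t pciexbar)
{
    return pciexbar & PCIEXBAR_BASE_MASK;
}
```

That way a stray low bit in the register can't end up in base_address.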

  OG.

Re: [PATCH -mm] MMCONFIG: Fix x86_64 ioremap base_address

2006-12-25 Thread Olivier Galibert

Sorry I missed the original email, but what is the chipset (name, pci
ID) of the board(s) with the problematic bios?

  OG.


Re: Network drivers that don't suspend on interface down

2006-12-20 Thread Olivier Galibert

On Wed, Dec 20, 2006 at 04:34:17PM +0100, Arjan van de Ven wrote:
> 5 seconds is unfair and unrealistic though. The *hardware* negotiation
> before link is seen can easily take upto 45 seconds already.
> That's a network topology/hardware issue (spanning tree fun) that
> software or even the hardware in your PC can do nothing about.

It's about ergonomics, not technical capabilities or fairness.

> this means that the "power up time" needs to be at least 45 seconds, if
> it's then down 5 seconds inbetween... that's not real power savings.

Then that means you can't have usable autodetection and power savings
at the same time.  That's a perfectly acceptable answer, you just have
to give the choice between the two to the user.  From the kernel
p.o.v, it just means that you probably need 3 modes:
1- active and exchanging packets

2- inactive but waiting for plugging and able to tell something is
   going on fast (like 0.5s fast)

3- powered off

and they probably already exist (UP+addr/procmisc. set, UP and DOWN).
And if the second mode can't be lower power than the first, that's
just life.  An hypothetical mode 4 identical to 2 without the "fast"
part is just not worth bothering with.
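
The three modes can be sketched as a small classification. This is only
an illustration: the enum, the function names and the 500ms figure (taken
from the "0.5s fast" above) are hypothetical, not existing kernel
interfaces.

```c
#include <stdbool.h>

/* Hypothetical names; the kernel has no such enum. */
enum link_mode {
	LINK_ACTIVE,    /* 1: up and configured, exchanging packets */
	LINK_WAITING,   /* 2: up but idle, must notice a plug event fast */
	LINK_OFF        /* 3: powered off */
};

/* Map the administrative state onto the three modes. */
static enum link_mode link_mode_for(bool admin_up, bool configured)
{
	if (!admin_up)
		return LINK_OFF;
	return configured ? LINK_ACTIVE : LINK_WAITING;
}

/* Only the waiting mode carries the ergonomics budget (about 0.5s);
   0 means no plug-detection deadline applies. */
static unsigned int detect_budget_ms(enum link_mode mode)
{
	return mode == LINK_WAITING ? 500 : 0;
}
```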

  OG.

Re: Network drivers that don't suspend on interface down

2006-12-20 Thread Olivier Galibert

On Wed, Dec 20, 2006 at 02:38:51PM +0100, Arjan van de Ven wrote:
> [1] What kind of latency would be allowed? Would an implementation be
> allowed to power up the phy say once per minute or once per 5 minutes to
> see if there is link? The implementation could do this progressively;
> first poll every X seconds, then after an hour, every minute etc.

I suspect that the hard maximum latency is the time needed by the user
to start the network himself, be it opening a root xterm and doing the
appropriate invocation or pulling up and clicking where appropriate in
a GUI.  That's probably around 5 seconds.  Over that, and they won't
even notice there is an autodetection running.

But still, 5 seconds is probably too much as well, because it's going
to look unreliable.  The user has to see something happen within
half a second or so, otherwise he's going to start doing it by hand.
The "see" part is distribution/desktop-dependent and not the kernel's
problem, but the clock starts when the RJ45 is plugged in.

  OG.

Detecting disk I/O errors

2006-12-18 Thread Olivier Galibert

Is there a way to know if there has been I/O error(s) on a specific
disk or partition since boot other than parsing dmesg and hoping it's
both still there and in the expected format?

Of course that's if the error didn't kill the system in the first
place :-)

  OG.


Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Olivier Galibert

;
  }
}

*offset += size;
return size;
  }
  case MODE_MAILBOX: {
size_t cur_size = size;
for(;;) {
  unsigned int val;
  if(get_user(val, (unsigned int *)buf))
	return -EFAULT;
  iiadc64_mailbox_write(is.access_mbox, val);
  buf += sizeof(unsigned int);
  if(cur_size <= sizeof(unsigned int))
	break;
  cur_size -= sizeof(unsigned int);
}
*offset += size;
return size;
  }
  default:
return -ENOTTY;
  }
}


#ifdef MODULE
int init_module(void)
#else
  int iiadc64_init(void)
#endif
{
  DEBUG_AL("IIADC64 minimal kernel driver (c) 1999 Olivier Galibert\n");

  is.dev = pci_find_device(0x10e8, 0x807f, 0);
  if(!is.dev) {
DEBUG_ER("iiadc64: Unable to find the DSP board\n");
return -EIO;
  }

#if LINUX_VERSION_CODE >= 0x20400
  if(pci_enable_device(is.dev))
return -EIO;

  pci_set_master(is.dev);

  is.iobase = pci_resource_start(is.dev, 0);
#else
  is.iobase = is.dev->base_address[0] & PCI_BASE_ADDRESS_IO_MASK;
#endif
  is.irq = is.dev->irq;
  is.memory = rvmalloc(RING_SIZE);

  if(!is.memory) {
DEBUG_ER("iiadc64: Couldn't allocate ring buffer\n");
return -EIO;
  }

  if (register_chrdev(II_MAJOR, "iiadc64", &iiadc64_fops)) {
DEBUG_ER("iiadc64: Unable to register character device\n");
return -EIO;
  }

  DEBUG_IN("iiadc64: DSP board found, io at 0x%lx, irq %u\n", is.iobase, is.irq);

#if LINUX_VERSION_CODE >= 0x20300
  init_waitqueue_head(&is.ring_wait);
#else
  is.ring_wait = 0;
  init_waitqueue(&is.ring_wait);
#endif

  is.lock = SPIN_LOCK_UNLOCKED;

  iiadc64_reset();

  return 0;
}

#ifdef MODULE
void cleanup_module(void)
{
  rvfree(is.memory, RING_SIZE);
  unregister_chrdev(II_MAJOR, "iiadc64");
}
#endif
//= Official Notice ===
//
// "This software was developed at the National Institute of Standards
// and Technology by employees of the Federal Government in the course of
// their official duties. Pursuant to Title 17 Section 105 of the United
// States Code this software is not subject to copyright protection and
// is in the public domain.
//
// The NIST Data Flow System (NDFS) is an experimental system and is
// offered AS IS. NIST assumes no responsibility whatsoever for its use
// by other parties, and makes no guarantees and NO WARRANTIES, EXPRESS
// OR IMPLIED, about its quality, reliability, fitness for any purpose,
// or any other characteristic.
//
// We would appreciate acknowledgement if the software is used.
//
// This software can be redistributed and/or modified freely provided
// that any derivative works bear some notice that they are derived from
// it, and any modified versions bear some notice that they have been
// modified from the original."
//
//=



#ifndef __IIADC64_H
#define __IIADC64_H

#include 

#define II_MAJOR 121

#define II_RESET		_IO(II_MAJOR, 0)
#define II_MODE_MEMORY		_IO(II_MAJOR, 1)
#define II_MODE_RING		_IO(II_MAJOR, 2)
#define II_MODE_MAILBOX		_IOR(II_MAJOR, 3, int)
#define II_PCI_TABLE		_IOR(II_MAJOR, 4, unsigned int)
#define II_RUN			_IOR(II_MAJOR, 5, unsigned int)

#endif

Re: GPL only modules [was Re: [GIT PATCH] more Driver core patches for 2.6.19]

2006-12-14 Thread Olivier Galibert

On Thu, Dec 14, 2006 at 11:11:33AM -0500, Dave Jones wrote:
> On Thu, Dec 14, 2006 at 04:05:14PM +0100, Adrian Bunk wrote:
>  > If a kernel developer or a competitor sends a cease&desist letter to 
>  > such a distribution, the situation changes from a complicated "derived 
>  > work" discussion to a relatively clear "They circumvented a technical 
>  > measure to enforce the copyright.".
> 
> C&D's don't work that way.  They can enforce "don't ship my code"
> but not "ship my code, or else".  The modification would be just like
> any other thats allowable by the GPL.

Careful here.  The "technical measure" protection is something
unrelated to the copyright license itself.  Cf the streambox vcr
lawsuit for instance (settled though) where not implementing the
handling of one bit that said "don't save to disk" in original code
seemed to be illegal.

  OG.


Re: [GIT PATCH] more Driver core patches for 2.6.19

2006-12-14 Thread Olivier Galibert

On Thu, Dec 14, 2006 at 10:56:03AM +0100, Hans-Jürgen Koch wrote:
> A small German manufacturer produces high-end AD converter cards. He sells
> 100 pieces per year, only in Germany and only with Windows drivers. He would
> now like to make his cards work with Linux. He has two driver programmers
> with little experience in writing Linux kernel drivers. What do you tell him?
> Write a large kernel module from scratch? Completely rewrite his code 
> because it uses floating point arithmetics?

Write a small kernel module which:
- creates a device node per card
- reads the data from the A/D as fast as possible and buffers it in main
  memory without touching it
- implements a read interface to read data from the buffer
- implements ioctls for whatever controls you need

And that's it.  All the rest can be done in userspace, safely, with
floating point, C++ and everything.  If the driver programmers are
worth their pay, their driver is probably already split logically at
where the userspace-kernel interface would be.

And small means small, like 200 lines or so, more if you want to have
fun with sysfs, poll, aio and their ilk, but that's not a necessity.
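
The "buffer it untouched, then hand it out through read" part can be
sketched in plain C.  This is a userspace illustration under assumptions:
the structure, the names and the 4096-byte size are hypothetical, and a
real driver would copy to the caller with copy_to_user() and add locking
between the interrupt handler and the reader.

```c
#include <stddef.h>
#include <string.h>

#define RING_BYTES 4096u   /* hypothetical size; must be a power of two */

struct sample_ring {
	unsigned char data[RING_BYTES];
	size_t head;   /* total bytes ever written by the producer */
	size_t tail;   /* total bytes ever handed to the reader */
};

/* Producer side: what an interrupt handler would do with fresh A/D data.
   Copies as much as fits and returns the number of bytes stored. */
static size_t ring_put(struct sample_ring *r, const void *src, size_t len)
{
	size_t space = RING_BYTES - (r->head - r->tail);
	size_t pos, first;

	if (len > space)
		len = space;
	pos = r->head & (RING_BYTES - 1);
	first = RING_BYTES - pos;          /* bytes before the wrap point */
	if (first > len)
		first = len;
	memcpy(r->data + pos, src, first);
	memcpy(r->data, (const unsigned char *)src + first, len - first);
	r->head += len;
	return len;
}

/* Consumer side: the body of a read() implementation. */
static size_t ring_get(struct sample_ring *r, void *dst, size_t len)
{
	size_t avail = r->head - r->tail;
	size_t pos, first;

	if (len > avail)
		len = avail;
	pos = r->tail & (RING_BYTES - 1);
	first = RING_BYTES - pos;
	if (first > len)
		first = len;
	memcpy(dst, r->data + pos, first);
	memcpy((unsigned char *)dst + first, r->data, len - first);
	r->tail += len;
	return len;
}
```

The samples are never interpreted on the kernel side; all conversion and
floating point work happens on whatever read() returns.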

  OG.


Re: ReiserFS corruption with 2.6.19 (Was 2.6.19 is not stable with SATA and should not be used by any means)

2006-12-12 Thread Olivier Galibert

On Tue, Dec 12, 2006 at 11:44:18AM -0700, Andrew Robinson wrote:
> When I said hibernate, I did mention it was to disk, not to ram.

Suspend to disk is not trustworthy on Linux, and does not look like it
will be any time soon.  Suspend to ram has a better chance of becoming
reliable, but at this point is not ide/sata compatible, and X will
keep making things hard.

  OG.

Re: sysfs file creation result nightmare (WAS radeonfb: Fix sysfs_create_bin_file warnings)

2006-12-09 Thread Olivier Galibert

On Sat, Dec 09, 2006 at 01:58:29PM -0800, Andrew Morton wrote:
> On Sat, 9 Dec 2006 22:44:53 +0100
> Olivier Galibert <[EMAIL PROTECTED]> wrote:
> > Hmmm, I don't understand.  Which is the bug, having a sysfs file
> > creation fail or going on if it happens?
> 
> Probably the former, probably the latter.
> 
> There may be situations in which we want do to "create this sysfs file if
> it doesn't already exist", but I'm not aware of any such.
> 
> Generally speaking, if sysfs file creation went wrong, it's due to a bug. 
> The result is that the driver isn't working as intended: tunables or
> instrumentation which it is designed to make available are not present.  We
> want to know about that bug asap so we can get it fixed.

Hmmm, then why don't you just drop the return value from the creation
function and BUG() in there if something went wrong?  That would allow
for better error messages too.

  OG.


Re: sysfs file creation result nightmare (WAS radeonfb: Fix sysfs_create_bin_file warnings)

2006-12-09 Thread Olivier Galibert

On Sat, Dec 09, 2006 at 12:38:17PM -0800, Andrew Morton wrote:
> On Sun, 10 Dec 2006 06:59:10 +1100
> Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:
> > Why would I prevent the framebuffer from initializing (and thus a
> > console to be displayed at all on many machines) just because for some
> > reason, I couldn't create a pair of EDID files in sysfs that are not
> > even very useful anymore ?
> 
> Because there's a bug in your kernel.  We don't hide and work around bugs.

Hmmm, I don't understand.  Which is the bug, having a sysfs file
creation fail or going on if it happens?

  OG.


Re: BUG: warning at drivers/scsi/ahci.c:859/ahci_host_intr() [ 2.6.17.14 ]

2006-12-08 Thread Olivier Galibert

On Sat, Dec 09, 2006 at 01:18:30AM +, Alan wrote:
> On Fri, 8 Dec 2006 20:05:07 -0500
> koan <[EMAIL PROTECTED]> wrote:
> 
> > ata4: status=0x50 { DriveReady SeekComplete }
> > ata4: error=0x01 { AddrMarkNotFound }
> 
> That looks like a genuine drive problem.

Is a disk driver supposed to BUG() on a drive missing sector though?

  OG.

[PATCH] [Bluetooth] Add support for another Kensington dongle

2006-12-08 Thread Olivier Galibert

Add the stupid sco fixup quirk to yet another Broadcom/Kensington
device.
---
 drivers/bluetooth/hci_usb.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/bluetooth/hci_usb.c b/drivers/bluetooth/hci_usb.c
index fdea58a..aeefec9 100644
--- a/drivers/bluetooth/hci_usb.c
+++ b/drivers/bluetooth/hci_usb.c
@@ -126,6 +126,7 @@ static struct usb_device_id blacklist_ids[] = {
 
/* Kensington Bluetooth USB adapter */
{ USB_DEVICE(0x047d, 0x105d), .driver_info = HCI_RESET },
+   { USB_DEVICE(0x047d, 0x105e), .driver_info = HCI_WRONG_SCO_MTU },
 
/* ISSC Bluetooth Adapter v3.1 */
{ USB_DEVICE(0x1131, 0x1001), .driver_info = HCI_RESET },
-- 
1.4.4.1.g278f


[PATCH 4/5] PCI MMConfig: Detect and support the E7520 and the 945G/GZ/P/PL

2006-12-07 Thread Olivier Galibert

It seems that the only way to reliably support mmconfig in the
presence of funky biosen is to detect the hostbridge and read where
the window is mapped from its registers.  Do that for the E7520 and
the 945G/GZ/P/PL for a start.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/i386/pci/mmconfig-shared.c |  116 ++-
 1 files changed, 113 insertions(+), 3 deletions(-)

diff --git a/arch/i386/pci/mmconfig-shared.c b/arch/i386/pci/mmconfig-shared.c
index 7b19639..302d495 100644
--- a/arch/i386/pci/mmconfig-shared.c
+++ b/arch/i386/pci/mmconfig-shared.c
@@ -3,6 +3,7 @@
  * MMCONFIG - common code between i386 and x86-64.
  * 
  * This code does:
+ * - known chipset handling
  * - ACPI decoding and validation
  *
  * Per-architecture code takes care of the mappings and accesses
@@ -55,12 +56,121 @@ static __init void unreachable_devices(void)
}
 }
 
+static __init const char *pci_mmcfg_e7520(void)
+{
+   u32 win;
+   pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
+
+   pci_mmcfg_config_num = 1;
+   pci_mmcfg_config = kzalloc(sizeof(pci_mmcfg_config[0]), GFP_KERNEL);
+   pci_mmcfg_config[0].base_address = (win & 0xf000) << 16;
+   pci_mmcfg_config[0].pci_segment_group_number = 0;
+   pci_mmcfg_config[0].start_bus_number = 0;
+   pci_mmcfg_config[0].end_bus_number = 255;
+
+   return "Intel Corporation E7520 Memory Controller Hub";
+}
+
+static __init const char *pci_mmcfg_intel_945(void)
+{
+   u32 pciexbar, mask = 0, len = 0;
+
+   pci_mmcfg_config_num = 1;
+
+   pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
+
+   /* Enable bit */
+   if (!(pciexbar & 1))
+   pci_mmcfg_config_num = 0;
+
+   /* Size bits */
+   switch ((pciexbar >> 1) & 3) {
+   case 0:
+   mask = 0xf0000000U;
+   len  = 0x10000000U;
+   break;
+   case 1:
+   mask = 0xf8000000U;
+   len  = 0x08000000U;
+   break;
+   case 2:
+   mask = 0xfc000000U;
+   len  = 0x04000000U;
+   break;
+   default:
+   pci_mmcfg_config_num = 0;
+   }
+
+   /* Errata #2, things break when not aligned on a 256Mb boundary */
+   /* Can only happen in 64M/128M mode */
+
+   if ((pciexbar & mask) & 0x0fffffffU)
+   pci_mmcfg_config_num = 0;
+
+   if (pci_mmcfg_config_num) {
+   pci_mmcfg_config = kzalloc(sizeof(pci_mmcfg_config[0]), GFP_KERNEL);
+   pci_mmcfg_config[0].base_address = pciexbar & mask;
+   pci_mmcfg_config[0].pci_segment_group_number = 0;
+   pci_mmcfg_config[0].start_bus_number = 0;
+   pci_mmcfg_config[0].end_bus_number = (len >> 20) - 1;
+   }
+
+   return "Intel Corporation 945G/GZ/P/PL Express Memory Controller Hub";
+}
+
+struct pci_mmcfg_hostbridge_probe {
+   u32 vendor;
+   u32 device;
+   const char *(*probe)(void);
+};
+
+static __initdata struct pci_mmcfg_hostbridge_probe pci_mmcfg_probes[] = {
+   { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7520_MCH, pci_mmcfg_e7520 },
+   { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82945G_HB, pci_mmcfg_intel_945 },
+};
+
+static int __init pci_mmcfg_check_hostbridge(void)
+{
+   u32 l;
+   u16 vendor, device;
+   int i;
+   const char *name;
+
+   pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
+   vendor = l & 0xffff;
+   device = (l >> 16) & 0xffff;
+
+   pci_mmcfg_config_num = 0;
+   pci_mmcfg_config = NULL;
+   name = NULL;
+
+   for (i = 0; !name && i < sizeof(pci_mmcfg_probes) / sizeof(pci_mmcfg_probes[0]); i++)
+   if ((pci_mmcfg_probes[i].vendor == PCI_ANY_ID || pci_mmcfg_probes[i].vendor == vendor) &&
+   (pci_mmcfg_probes[i].device == PCI_ANY_ID || pci_mmcfg_probes[i].device == device))
+   name = pci_mmcfg_probes[i].probe();
+
+   if (name) {
+   if (pci_mmcfg_config_num)
+   printk(KERN_INFO "PCI: Found %s with MMCONFIG support.\n", name);
+   else
+   printk(KERN_INFO "PCI: Found %s without MMCONFIG support.\n", name);
+   }
+
+   return name != NULL;
+}
+
 void __init pci_mmcfg_init(int type)
 {
+   int known_bridge = 0;
+
if ((pci_probe & PCI_PROBE_MMCONF) == 0)
return;
 
-   acpi_table_parse(ACPI_MCFG, acpi_parse_mcfg);
+   if (type == 1 && pci_mmcfg_check_hostbridge())
+   known_bridge = 1;
+
+   if (!known_bridge)
+   acpi_table_parse(ACPI_MCFG, acpi_parse_mcfg);
 
if ((pci_mmcfg_config_num == 0) ||
(pci_mmcfg_config == NULL) ||
@@ -68,8 +178,8 @@ void __init pci_mmcfg_in

[PATCH 1/5] PCI MMConfig: Share what's shareable.

2006-12-07 Thread Olivier Galibert

i386 and x86-64 pci mmconfig code have a lot in common.  So share
what's shareable between the two.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/i386/pci/Makefile  |2 +-
 arch/i386/pci/mmconfig-shared.c |   86 +++
 arch/i386/pci/mmconfig.c|   74 ++---
 arch/i386/pci/pci.h |6 +++
 arch/x86_64/pci/Makefile|3 +-
 arch/x86_64/pci/mmconfig.c  |   71 +---
 6 files changed, 109 insertions(+), 133 deletions(-)

diff --git a/arch/i386/pci/Makefile b/arch/i386/pci/Makefile
index 1594d2f..44650e0 100644
--- a/arch/i386/pci/Makefile
+++ b/arch/i386/pci/Makefile
@@ -1,7 +1,7 @@
 obj-y  := i386.o init.o
 
 obj-$(CONFIG_PCI_BIOS) += pcbios.o
-obj-$(CONFIG_PCI_MMCONFIG) += mmconfig.o direct.o
+obj-$(CONFIG_PCI_MMCONFIG) += mmconfig.o direct.o mmconfig-shared.o
 obj-$(CONFIG_PCI_DIRECT)   += direct.o
 
 pci-y  := fixup.o
diff --git a/arch/i386/pci/mmconfig-shared.c b/arch/i386/pci/mmconfig-shared.c
new file mode 100644
index 0000000..b3ab210
--- /dev/null
+++ b/arch/i386/pci/mmconfig-shared.c
@@ -0,0 +1,86 @@
+/*
+ * mmconfig-shared.c - Low-level direct PCI config space access via
+ * MMCONFIG - common code between i386 and x86-64.
+ * 
+ * This code does:
+ * - ACPI decoding and validation
+ *
+ * Per-architecture code takes care of the mappings and accesses
+ * themselves.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pci.h"
+
+/* aperture is up to 256MB but BIOS may reserve less */
+#define MMCONFIG_APER_MIN  (2 * 1024*1024)
+#define MMCONFIG_APER_MAX  (256 * 1024*1024)
+
+/* Verify the first 16 busses. We assume that systems with more busses
+   get MCFG right. */
+#define PCI_MMCFG_MAX_CHECK_BUS 16
+
+DECLARE_BITMAP(pci_mmcfg_fallback_slots, 32*PCI_MMCFG_MAX_CHECK_BUS);
+
+/* K8 systems have some devices (typically in the builtin northbridge)
+   that are only accessible using type1
+   Normally this can be expressed in the MCFG by not listing them
+   and assigning suitable _SEGs, but this isn't implemented in some BIOS.
+   Instead try to discover all devices on bus 0 that are unreachable using MM
+   and fallback for them. */
+static __init void unreachable_devices(void)
+{
+   int i, k;
+   /* Use the max bus number from ACPI here? */
+   for (k = 0; k < PCI_MMCFG_MAX_CHECK_BUS; k++) {
+   for (i = 0; i < 32; i++) {
+   u32 val1, val2;
+
+   pci_conf1_read(0, k, PCI_DEVFN(i,0), 0, 4, &val1);
+   if (val1 == 0xffffffff)
+   continue;
+   
+   raw_pci_ops->read(0, k, PCI_DEVFN(i, 0), 0, 4, &val2);
+   if (val1 != val2) {
+   set_bit(i + 32*k, pci_mmcfg_fallback_slots);
+   printk(KERN_NOTICE "PCI: No mmconfig possible"
+  " on device %02x:%02x\n", k, i);
+   }
+   }
+   }
+}
+
+void __init pci_mmcfg_init(int type)
+{
+   if ((pci_probe & PCI_PROBE_MMCONF) == 0)
+   return;
+
+   acpi_table_parse(ACPI_MCFG, acpi_parse_mcfg);
+
+   if ((pci_mmcfg_config_num == 0) ||
+   (pci_mmcfg_config == NULL) ||
+   (pci_mmcfg_config[0].base_address == 0))
+   return;
+
+   /* Only do this check when type 1 works. If it doesn't work
+   assume we run on a Mac and always use MCFG */
+   if (type == 1 &&
+   !e820_all_mapped(pci_mmcfg_config[0].base_address,
+pci_mmcfg_config[0].base_address + MMCONFIG_APER_MIN,
+E820_RESERVED)) {
+   printk(KERN_ERR "PCI: BIOS Bug: MCFG area at %x is not E820-reserved\n",
+   pci_mmcfg_config[0].base_address);
+   printk(KERN_ERR "PCI: Not using MMCONFIG.\n");
+   return;
+   }
+
+   if (pci_mmcfg_arch_init()) {
+   unreachable_devices();
+   pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
+   }
+}
diff --git a/arch/i386/pci/mmconfig.c b/arch/i386/pci/mmconfig.c
index c6b6d9b..507426f 100644
--- a/arch/i386/pci/mmconfig.c
+++ b/arch/i386/pci/mmconfig.c
@@ -15,20 +15,12 @@
 #include 
 #include "pci.h"
 
-/* aperture is up to 256MB but BIOS may reserve less */
-#define MMCONFIG_APER_MIN  (2 * 1024*1024)
-#define MMCONFIG_APER_MAX  (256 * 1024*1024)
-
 /* Assume systems with more busses have correct MCFG */
-#define MAX_CHECK_BUS 16
-
 #define mmcfg_virt_addr ((void __iomem *) fix_to_virt(FIX_PCIE_MCFG))
 
 /* The base address of the last MMCONFIG de

[PATCH 3/5] PCI MMConfig: Only map what's necessary.

2006-12-07 Thread Olivier Galibert

The x86-64 mmconfig code always maps a range of MMCONFIG_APER_MAX
bytes, i.e. 256MB, whatever the number of accessible busses is.  Fix
it, and add the end of the zone in the printk while we're at it.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/x86_64/pci/mmconfig.c |   12 +---
 1 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/x86_64/pci/mmconfig.c b/arch/x86_64/pci/mmconfig.c
index c71c181..3d13220 100644
--- a/arch/x86_64/pci/mmconfig.c
+++ b/arch/x86_64/pci/mmconfig.c
@@ -13,10 +13,6 @@
 
 #include "pci.h"
 
-/* aperture is up to 256MB but BIOS may reserve less */
-#define MMCONFIG_APER_MIN  (2 * 1024*1024)
-#define MMCONFIG_APER_MAX  (256 * 1024*1024)
-
 /* Verify the first 16 busses. We assume that systems with more busses
get MCFG right. */
 #define PCI_MMCFG_MAX_CHECK_BUS 16
@@ -143,17 +139,19 @@ int __init pci_mmcfg_arch_init(void)
}
 
for (i = 0; i < pci_mmcfg_config_num; ++i) {
+   u32 size = (pci_mmcfg_config[i].end_bus_number - pci_mmcfg_config[i].start_bus_number + 1) << 20;
pci_mmcfg_virt[i].cfg = &pci_mmcfg_config[i];
pci_mmcfg_virt[i].virt = ioremap_nocache(pci_mmcfg_config[i].base_address,
-MMCONFIG_APER_MAX);
+size);
if (!pci_mmcfg_virt[i].virt) {
printk(KERN_ERR "PCI: Cannot map mmconfig aperture for "
"segment %d\n",
   pci_mmcfg_config[i].pci_segment_group_number);
return 0;
}
-   printk(KERN_INFO "PCI: Using MMCONFIG at %x\n",
-  pci_mmcfg_config[i].base_address);
+   printk(KERN_INFO "PCI: Using MMCONFIG at %x-%x\n",
+  pci_mmcfg_config[i].base_address,
+  pci_mmcfg_config[i].base_address + size - 1);
}
 
raw_pci_ops = &pci_mmcfg;
-- 
1.4.4.1.g278f
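
The size computation in the patch above relies on each bus owning 1MB of
MMCONFIG space (32 devices x 8 functions x 4KB of config space each).  A
minimal standalone sketch of that arithmetic, with hypothetical names and
assuming end_bus >= start_bus:

```c
#include <stdint.h>

/* Each bus owns 1MB of MMCONFIG space:
   32 devices x 8 functions x 4KB of config space. */
#define MMCFG_BYTES_PER_BUS_SHIFT 20   /* hypothetical name */

/* Bytes that need mapping for an inclusive bus range. */
static uint32_t mmcfg_window_size(uint8_t start_bus, uint8_t end_bus)
{
	return ((uint32_t)end_bus - start_bus + 1) << MMCFG_BYTES_PER_BUS_SHIFT;
}
```

A full 0-255 range gives the 256MB that used to be mapped
unconditionally; a chipset exposing only busses 0-63 needs 64MB.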


[PATCH 2/5] PCI MMConfig: Only call unreachable_devices() when type 1 is available.

2006-12-07 Thread Olivier Galibert

unreachable_devices compares the results of pci configuration
accesses through type1 and mmconfig, so it should be called only if
type1 actually works in the first place.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/i386/pci/mmconfig-shared.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/i386/pci/mmconfig-shared.c b/arch/i386/pci/mmconfig-shared.c
index b3ab210..7b19639 100644
--- a/arch/i386/pci/mmconfig-shared.c
+++ b/arch/i386/pci/mmconfig-shared.c
@@ -80,7 +80,8 @@ void __init pci_mmcfg_init(int type)
}
 
if (pci_mmcfg_arch_init()) {
-   unreachable_devices();
+   if (type == 1)
+   unreachable_devices();
pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
}
 }
-- 
1.4.4.1.g278f


[PATCH 5/5] PCI MMConfig: Reserve resources but only when we're sure about them.

2006-12-07 Thread Olivier Galibert

Put back the resource reservation as per
4c6e052adfe285ede5884e4e8c4d33af33932c13 but use it *only* when the
range(s) come from a chipset probe instead of the bios.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/i386/pci/mmconfig-shared.c |   33 +
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/i386/pci/mmconfig-shared.c b/arch/i386/pci/mmconfig-shared.c
index 302d495..0da1e3b 100644
--- a/arch/i386/pci/mmconfig-shared.c
+++ b/arch/i386/pci/mmconfig-shared.c
@@ -159,6 +159,37 @@ static int __init pci_mmcfg_check_hostbridge(void)
return name != NULL;
 }
 
+static __init void pci_mmcfg_insert_resources(void)
+{
+#define PCI_MMCFG_RESOURCE_NAME_LEN 19
+   int i;
+   struct resource *res;
+   char *names;
+   unsigned num_buses;
+
+   res = kcalloc(PCI_MMCFG_RESOURCE_NAME_LEN + sizeof(*res),
+   pci_mmcfg_config_num, GFP_KERNEL);
+
+   if (!res) {
+   printk(KERN_ERR "PCI: Unable to allocate MMCONFIG resources\n");
+   return;
+   }
+
+   names = (void *)&res[pci_mmcfg_config_num];
+   for (i = 0; i < pci_mmcfg_config_num; i++, res++) {
+   num_buses = pci_mmcfg_config[i].end_bus_number -
+   pci_mmcfg_config[i].start_bus_number + 1;
+   res->name = names;
+   snprintf(names, PCI_MMCFG_RESOURCE_NAME_LEN, "PCI MMCONFIG %u",
+   pci_mmcfg_config[i].pci_segment_group_number);
+   res->start = pci_mmcfg_config[i].base_address;
+   res->end = res->start + (num_buses << 20) - 1;
+   res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+   insert_resource(&iomem_resource, res);
+   names += PCI_MMCFG_RESOURCE_NAME_LEN;
+   }
+}
+
 void __init pci_mmcfg_init(int type)
 {
int known_bridge = 0;
@@ -192,6 +223,8 @@ void __init pci_mmcfg_init(int type)
if (pci_mmcfg_arch_init()) {
if (type == 1)
unreachable_devices();
+   if (known_bridge)
+   pci_mmcfg_insert_resources();
pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
}
 }
-- 
1.4.4.1.g278f
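
The allocation in the patch above packs the resource structs and their
name strings into one block: n structs first, then n fixed-size name
buffers.  A userspace sketch of the same layout, with calloc() standing
in for kcalloc() and a hypothetical struct region standing in for
struct resource:

```c
#include <stdio.h>
#include <stdlib.h>

#define NAME_LEN 19   /* same budget as PCI_MMCFG_RESOURCE_NAME_LEN */

struct region {       /* hypothetical stand-in for struct resource */
	const char *name;
	unsigned long start, end;
};

/* One calloc() holds n structs followed by n fixed-size name buffers.
   Returns NULL on allocation failure; the caller releases everything
   with a single free(). */
static struct region *alloc_regions(size_t n, const unsigned *segments)
{
	struct region *res = calloc(n, sizeof(*res) + NAME_LEN);
	char *names;
	size_t i;

	if (!res)
		return NULL;
	names = (char *)&res[n];          /* names start right after the array */
	for (i = 0; i < n; i++, names += NAME_LEN) {
		snprintf(names, NAME_LEN, "PCI MMCONFIG %u", segments[i]);
		res[i].name = names;
	}
	return res;
}
```

The 19-byte budget is exactly enough for "PCI MMCONFIG 65535" plus the
terminating NUL, i.e. the largest 16-bit segment group number.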


[0/5] PCI MMConfig per-chipset support - v2

2006-12-07 Thread Olivier Galibert

I'll try to be less messy this time.

  OG.

1/5: PCI MMConfig: Share what's shareable.
  Share code between i386 and x86-64
 
2/5: PCI MMConfig: Only call unreachable_devices() when type 1 is available.
  Trivial fix.
 
3/5: PCI MMConfig: Only map what's necessary.
  Trivial fix too.
 
4/5: PCI MMConfig: Detect and support the E7520 and the 945G/GZ/P/PL
  The actual per-chipset support.
 
5/5: PCI MMConfig: Reserve resources but only when we're sure about them.
  Add the resources in /proc/iomem when the chipset is known.
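
For reference, the per-chipset decoding in 4/5 boils down to reading one
hostbridge register and splitting it into enable bit, size field and
base.  A standalone sketch of the 945 case follows; the struct and
function names are hypothetical, and the masks assume the 256MB / 128MB /
64MB window sizes the patch comments refer to.

```c
#include <stdbool.h>
#include <stdint.h>

struct mmcfg_window {      /* hypothetical result type */
	uint32_t base;
	uint32_t len;
	unsigned end_bus;
};

/* Decode a 945-style PCIEXBAR value.  Returns false when the window is
   disabled, the size field is reserved, or the base is not 256MB-aligned. */
static bool decode_pciexbar_945(uint32_t reg, struct mmcfg_window *out)
{
	uint32_t mask, len;

	if (!(reg & 1))                    /* bit 0: enable */
		return false;

	switch ((reg >> 1) & 3) {          /* bits 2:1: window size */
	case 0: mask = 0xf0000000u; len = 0x10000000u; break;   /* 256MB */
	case 1: mask = 0xf8000000u; len = 0x08000000u; break;   /* 128MB */
	case 2: mask = 0xfc000000u; len = 0x04000000u; break;   /*  64MB */
	default: return false;             /* reserved encoding */
	}

	if ((reg & mask) & 0x0fffffffu)    /* erratum: base must be 256MB-aligned */
		return false;

	out->base = reg & mask;
	out->len = len;
	out->end_bus = (len >> 20) - 1;    /* 1MB of config space per bus */
	return true;
}
```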

Re: [PATCH 1/5] PCI MMConfig: Share what's shareable.

2006-12-07 Thread Olivier Galibert

On Thu, Dec 07, 2006 at 06:03:17PM +0200, Muli Ben-Yehuda wrote:
> On Thu, Dec 07, 2006 at 04:53:36PM +0100, Olivier Galibert wrote:
> 
> > # git grep '//' -- '*.c' |fgrep -v 'http://' |wc -l
> > 14333
> > 
> > You lost that war ages ago.  Come join us in this millennium,
> > line-comments exist officially in C since 1999, and were supported
> > way before that.
> 
> If I was bored, I might've counted how many /* */ style comments we
> had in the source,

426K or so.

> then used it to construct an elaborate argument why C++-style
> comments are evil and Conformance is Goodness, but I'm not, so I
> won't.

I've yet to see an actual technical argument against them.  I find
them more readable by making it perfectly obvious what their
application range is, contrary to /* where you need to find the
closing */.  But that's just me.
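
A small illustration of that range point, as a sketch only (function
names are made up, and it needs a C99 compiler for the second half):

```c
/* Block comment left open by mistake: the "code" line is swallowed
   because the comment only ends at the next closing marker. */
static int doubled_block(int x)
{
	int y = x;      /* start from the input
	y = y * 2;         looks like a statement, is still comment text */
	return y;
}

/* Same function with line comments: each comment ends with its line. */
static int doubled_line(int x)
{
	int y = x;      // start from the input
	y = y * 2;      // really a statement
	return y;
}
```

The first version returns its input unchanged, the second doubles it.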

  OG.

Re: [PATCH 1/5] PCI MMConfig: Share what's shareable.

2006-12-07 Thread Olivier Galibert

On Thu, Dec 07, 2006 at 05:35:06PM +0200, Muli Ben-Yehuda wrote:
> arch/i386/pci/pci.h seems the least-inappropriate.

Ok, will do.

> Also, forgot to mention, please get rid of C++ style comments in the
> code.

# git grep '//' -- '*.c' |fgrep -v 'http://' |wc -l
14333

You lost that war ages ago.  Come join us in this millennium,
line-comments exist officially in C since 1999, and were supported way
before that.

  OG.

Re: [PATCH 1/5] PCI MMConfig: Share what's shareable.

2006-12-07 Thread Olivier Galibert

On Thu, Dec 07, 2006 at 05:00:23PM +0200, Muli Ben-Yehuda wrote:
> On Thu, Dec 07, 2006 at 03:49:53PM +0100, Olivier Galibert wrote:
> 
> > +void __init pci_mmcfg_init(int type)
> > +{
> > +   extern int pci_mmcfg_arch_init(void);
> 
> Please put this in a suitable header file.

Sure, which one?

  OG.

[PATCH] PCI MMConfig: Only call unreachable_devices() when type 1 is available.

2006-12-07 Thread Olivier Galibert

unreachable_devices compares the results of pci configuration
accesses through type1 and mmconfig, so it should be called only if
type1 actually works in the first place.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/i386/pci/mmconfig-shared.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/i386/pci/mmconfig-shared.c b/arch/i386/pci/mmconfig-shared.c
index 4ca3f5a..7a8a498 100644
--- a/arch/i386/pci/mmconfig-shared.c
+++ b/arch/i386/pci/mmconfig-shared.c
@@ -82,7 +82,8 @@ void __init pci_mmcfg_init(int type)
}
 
if (pci_mmcfg_arch_init()) {
-   unreachable_devices();
+   if (type == 1)
+   unreachable_devices();
pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
}
 }
-- 
1.4.4.1.g278f


Re: [0/5] PCI MMConfig per-chipset support

2006-12-07 Thread Olivier Galibert

It seems that the only way to reliably support mmconfig in the
presence of funky biosen is to detect the hostbridge and read where
the window is mapped from its registers.  Do that for the E7520 and
the 945G/GZ/P/PL for a start.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/i386/pci/mmconfig-shared.c |  114 ++-
 1 files changed, 112 insertions(+), 2 deletions(-)

diff --git a/arch/i386/pci/mmconfig-shared.c b/arch/i386/pci/mmconfig-shared.c
index 7a8a498..4906741 100644
--- a/arch/i386/pci/mmconfig-shared.c
+++ b/arch/i386/pci/mmconfig-shared.c
@@ -3,6 +3,7 @@
  * MMCONFIG - common code between i386 and x86-64.
  * 
  * This code does:
+ * - known chipset handling
  * - ACPI decoding and validation
  *
  * Per-architecture code takes care of the mappings and accesses
@@ -55,14 +56,123 @@ static __init void unreachable_devices(void)
}
 }
 
+static __init const char *pci_mmcfg_e7520(void)
+{
+   u32 win;
+   pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
+
+   pci_mmcfg_config_num = 1;
+   pci_mmcfg_config = kzalloc(sizeof(pci_mmcfg_config[0]), GFP_KERNEL);
+   pci_mmcfg_config[0].base_address = (win & 0xf000) << 16;
+   pci_mmcfg_config[0].pci_segment_group_number = 0;
+   pci_mmcfg_config[0].start_bus_number = 0;
+   pci_mmcfg_config[0].end_bus_number = 255;
+
+   return "Intel Corporation E7520 Memory Controller Hub";
+}
+
+static __init const char *pci_mmcfg_intel_945(void)
+{
+   u32 pciexbar, mask = 0, len = 0;
+
+   pci_mmcfg_config_num = 1;
+
+   pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
+
+   // Enable bit
+   if (!(pciexbar & 1))
+   pci_mmcfg_config_num = 0;
+
+   // Size bits
+   switch ((pciexbar >> 1) & 3) {
+   case 0:
+   mask = 0xf0000000U;
+   len  = 0x10000000U;
+   break;
+   case 1:
+   mask = 0xf8000000U;
+   len  = 0x08000000U;
+   break;
+   case 2:
+   mask = 0xfc000000U;
+   len  = 0x04000000U;
+   break;
+   default:
+   pci_mmcfg_config_num = 0;
+   }
+
+   // Errata #2, things break when not aligned on a 256Mb boundary
+   // Can only happen in 64M/128M mode
+
+   if ((pciexbar & mask) & 0x0fffffffU)
+   pci_mmcfg_config_num = 0;
+
+   if (pci_mmcfg_config_num) {
+   pci_mmcfg_config = kzalloc(sizeof(pci_mmcfg_config[0]), GFP_KERNEL);
+   pci_mmcfg_config[0].base_address = pciexbar & mask;
+   pci_mmcfg_config[0].pci_segment_group_number = 0;
+   pci_mmcfg_config[0].start_bus_number = 0;
+   pci_mmcfg_config[0].end_bus_number = (len >> 20) - 1;
+   }
+
+   return "Intel Corporation 945G/GZ/P/PL Express Memory Controller Hub";
+}
+
+struct pci_mmcfg_hostbridge_probe {
+   u32 vendor;
+   u32 device;
+   const char *(*probe)(void);
+};
+
+static __initdata struct pci_mmcfg_hostbridge_probe pci_mmcfg_probes[] = {
+   { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_E7520_MCH, pci_mmcfg_e7520 },
+   { PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82945G_HB, pci_mmcfg_intel_945 },
+};
+
+static int __init pci_mmcfg_check_hostbridge(void)
+{
+   u32 l;
+   u16 vendor, device;
+   int i;
+   const char *name;
+
+   pci_conf1_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
+   vendor = l & 0xffff;
+   device = (l >> 16) & 0xffff;
+
+   pci_mmcfg_config_num = 0;
+   pci_mmcfg_config = NULL;
+   name = NULL;
+
+   for (i = 0; !name && i < sizeof(pci_mmcfg_probes) / sizeof(pci_mmcfg_probes[0]); i++)
+   if ((pci_mmcfg_probes[i].vendor == PCI_ANY_ID || pci_mmcfg_probes[i].vendor == vendor) &&
+   (pci_mmcfg_probes[i].device == PCI_ANY_ID || pci_mmcfg_probes[i].device == device))
+   name = pci_mmcfg_probes[i].probe();
+
+   if (name) {
+   if (pci_mmcfg_config_num)
+   printk(KERN_INFO "PCI: Found %s with MMCONFIG support.\n", name);
+   else
+   printk(KERN_INFO "PCI: Found %s without MMCONFIG support.\n", name);
+   }
+
+   return name != NULL;
+}
+
 void __init pci_mmcfg_init(int type)
 {
extern int pci_mmcfg_arch_init(void);
 
+   int known_bridge = 0;
+
if ((pci_probe & PCI_PROBE_MMCONF) == 0)
return;
 
-   acpi_table_parse(ACPI_MCFG, acpi_parse_mcfg);
+   if (type == 1 && pci_mmcfg_check_hostbridge())
+   known_bridge = 1;
+
+   if (!known_bridge)
+   acpi_table_parse(ACPI_MCFG, acpi_parse_mcfg);
 
if ((pci_mmcfg_config_num == 0) ||
(pci_mmcfg_config == NULL) ||
@@ -71,7 +1

[PATCH 1/5] PCI MMConfig: Share what's shareable.

2006-12-07 Thread Olivier Galibert

i386 and x86-64 pci mmconfig code have a lot in common.  So share
what's shareable between the two.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/i386/pci/Makefile  |2 +-
 arch/i386/pci/mmconfig-shared.c |   88 +++
 arch/i386/pci/mmconfig.c|   70 ++-
 arch/x86_64/pci/Makefile|3 +-
 arch/x86_64/pci/mmconfig.c  |   64 +++-
 5 files changed, 102 insertions(+), 125 deletions(-)

diff --git a/arch/i386/pci/Makefile b/arch/i386/pci/Makefile
index 1594d2f..44650e0 100644
--- a/arch/i386/pci/Makefile
+++ b/arch/i386/pci/Makefile
@@ -1,7 +1,7 @@
 obj-y  := i386.o init.o
 
 obj-$(CONFIG_PCI_BIOS) += pcbios.o
-obj-$(CONFIG_PCI_MMCONFIG) += mmconfig.o direct.o
+obj-$(CONFIG_PCI_MMCONFIG) += mmconfig.o direct.o mmconfig-shared.o
 obj-$(CONFIG_PCI_DIRECT)   += direct.o
 
 pci-y  := fixup.o
diff --git a/arch/i386/pci/mmconfig-shared.c b/arch/i386/pci/mmconfig-shared.c
new file mode 100644
index 000..4ca3f5a
--- /dev/null
+++ b/arch/i386/pci/mmconfig-shared.c
@@ -0,0 +1,88 @@
+/*
+ * mmconfig-shared.c - Low-level direct PCI config space access via
+ * MMCONFIG - common code between i386 and x86-64.
+ * 
+ * This code does:
+ * - ACPI decoding and validation
+ *
+ * Per-architecture code takes care of the mappings and accesses
+ * themselves.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pci.h"
+
+/* aperture is up to 256MB but BIOS may reserve less */
+#define MMCONFIG_APER_MIN  (2 * 1024*1024)
+#define MMCONFIG_APER_MAX  (256 * 1024*1024)
+
+/* Verify the first 16 busses. We assume that systems with more busses
+   get MCFG right. */
+#define MAX_CHECK_BUS 16
+
+DECLARE_BITMAP(pci_mmcfg_fallback_slots, 32*MAX_CHECK_BUS);
+
+/* K8 systems have some devices (typically in the builtin northbridge)
+   that are only accessible using type1
+   Normally this can be expressed in the MCFG by not listing them
+   and assigning suitable _SEGs, but this isn't implemented in some BIOS.
+   Instead try to discover all devices on bus 0 that are unreachable using MM
+   and fallback for them. */
+static __init void unreachable_devices(void)
+{
+   int i, k;
+   /* Use the max bus number from ACPI here? */
+   for (k = 0; k < MAX_CHECK_BUS; k++) {
+   for (i = 0; i < 32; i++) {
+   u32 val1, val2;
+
+   pci_conf1_read(0, k, PCI_DEVFN(i,0), 0, 4, &val1);
+   if (val1 == 0xffffffff)
+   continue;
+   
+   raw_pci_ops->read(0, k, PCI_DEVFN(i, 0), 0, 4, &val2);
+   if (val1 != val2) {
+   set_bit(i + 32*k, pci_mmcfg_fallback_slots);
+   printk(KERN_NOTICE "PCI: No mmconfig possible"
+  " on device %02x:%02x\n", k, i);
+   }
+   }
+   }
+}
+
+void __init pci_mmcfg_init(int type)
+{
+   extern int pci_mmcfg_arch_init(void);
+
+   if ((pci_probe & PCI_PROBE_MMCONF) == 0)
+   return;
+
+   acpi_table_parse(ACPI_MCFG, acpi_parse_mcfg);
+
+   if ((pci_mmcfg_config_num == 0) ||
+   (pci_mmcfg_config == NULL) ||
+   (pci_mmcfg_config[0].base_address == 0))
+   return;
+
+   /* Only do this check when type 1 works. If it doesn't work
+   assume we run on a Mac and always use MCFG */
+   if (type == 1 &&
+   !e820_all_mapped(pci_mmcfg_config[0].base_address,
+pci_mmcfg_config[0].base_address + MMCONFIG_APER_MIN,
+E820_RESERVED)) {
+   printk(KERN_ERR "PCI: BIOS Bug: MCFG area at %x is not E820-reserved\n",
+   pci_mmcfg_config[0].base_address);
+   printk(KERN_ERR "PCI: Not using MMCONFIG.\n");
+   return;
+   }
+
+   if (pci_mmcfg_arch_init()) {
+   unreachable_devices();
+   pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
+   }
+}
diff --git a/arch/i386/pci/mmconfig.c b/arch/i386/pci/mmconfig.c
index c6b6d9b..36e1ba0 100644
--- a/arch/i386/pci/mmconfig.c
+++ b/arch/i386/pci/mmconfig.c
@@ -15,10 +15,6 @@
 #include 
 #include "pci.h"
 
-/* aperture is up to 256MB but BIOS may reserve less */
-#define MMCONFIG_APER_MIN  (2 * 1024*1024)
-#define MMCONFIG_APER_MAX  (256 * 1024*1024)
-
 /* Assume systems with more busses have correct MCFG */
 #define MAX_CHECK_BUS 16
 
@@ -27,7 +23,7 @@
 /* The base address of the last MMCONFIG device accessed */
 static u32 mmcfg_last_accessed_device;
 
-static DECLARE_BITMAP(fallback_slo

[PATCH 5/5] PCI MMConfig: Reserve resources but only when we're sure about them.

2006-12-07 Thread Olivier Galibert

Put back the resource reservation as per
4c6e052adfe285ede5884e4e8c4d33af33932c13 but use it *only* when the
range(s) come from a chipset probe instead of the bios.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/i386/pci/mmconfig-shared.c |   33 +
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/i386/pci/mmconfig-shared.c b/arch/i386/pci/mmconfig-shared.c
index 4906741..7599b89 100644
--- a/arch/i386/pci/mmconfig-shared.c
+++ b/arch/i386/pci/mmconfig-shared.c
@@ -159,6 +159,37 @@ static int __init pci_mmcfg_check_hostbridge(void)
return name != NULL;
 }
 
+static __init void pci_mmcfg_insert_resources(void)
+{
+#define PCI_MMCFG_RESOURCE_NAME_LEN 19
+   int i;
+   struct resource *res;
+   char *names;
+   unsigned num_buses;
+
+   res = kcalloc(PCI_MMCFG_RESOURCE_NAME_LEN + sizeof(*res),
+   pci_mmcfg_config_num, GFP_KERNEL);
+
+   if (!res) {
+   printk(KERN_ERR "PCI: Unable to allocate MMCONFIG resources\n");
+   return;
+   }
+
+   names = (void *)&res[pci_mmcfg_config_num];
+   for (i = 0; i < pci_mmcfg_config_num; i++, res++) {
+   num_buses = pci_mmcfg_config[i].end_bus_number -
+   pci_mmcfg_config[i].start_bus_number + 1;
+   res->name = names;
+   snprintf(names, PCI_MMCFG_RESOURCE_NAME_LEN, "PCI MMCONFIG %u",
+   pci_mmcfg_config[i].pci_segment_group_number);
+   res->start = pci_mmcfg_config[i].base_address;
+   res->end = res->start + (num_buses << 20) - 1;
+   res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+   insert_resource(&iomem_resource, res);
+   names += PCI_MMCFG_RESOURCE_NAME_LEN;
+   }
+}
+
 void __init pci_mmcfg_init(int type)
 {
extern int pci_mmcfg_arch_init(void);
@@ -194,6 +225,8 @@ void __init pci_mmcfg_init(int type)
if (pci_mmcfg_arch_init()) {
if (type == 1)
unreachable_devices();
+   if (known_bridge)
+   pci_mmcfg_insert_resources();
pci_probe = (pci_probe & ~PCI_PROBE_MASK) | PCI_PROBE_MMCONF;
}
 }
-- 
1.4.4.1.g278f


[PATCH] PCI MMConfig: Only map what's necessary.

2006-12-07 Thread Olivier Galibert

The x86-64 mmconfig code always maps a range of MMCONFIG_APER_MAX
bytes, i.e. 256MB, whatever the number of accessible busses is.  Fix
it, and add the end of the zone in the printk while we're at it.

Signed-off-by: Olivier Galibert <[EMAIL PROTECTED]>
---
 arch/x86_64/pci/mmconfig.c |9 +++--
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86_64/pci/mmconfig.c b/arch/x86_64/pci/mmconfig.c
index b270f20..091f759 100644
--- a/arch/x86_64/pci/mmconfig.c
+++ b/arch/x86_64/pci/mmconfig.c
@@ -13,10 +13,6 @@
 
 #include "pci.h"
 
-/* aperture is up to 256MB but BIOS may reserve less */
-#define MMCONFIG_APER_MIN  (2 * 1024*1024)
-#define MMCONFIG_APER_MAX  (256 * 1024*1024)
-
 /* Verify the first 16 busses. We assume that systems with more busses
get MCFG right. */
 #define MAX_CHECK_BUS 16
@@ -145,16 +141,17 @@ int __init pci_mmcfg_arch_init(void)
}
 
for (i = 0; i < pci_mmcfg_config_num; ++i) {
+   u32 size = (pci_mmcfg_config[i].end_bus_number - pci_mmcfg_config[i].start_bus_number + 1) << 20;
pci_mmcfg_virt[i].cfg = &pci_mmcfg_config[i];
pci_mmcfg_virt[i].virt = ioremap_nocache(pci_mmcfg_config[i].base_address,
-MMCONFIG_APER_MAX);
+size);
if (!pci_mmcfg_virt[i].virt) {
printk(KERN_ERR "PCI: Cannot map mmconfig aperture for "
"segment %d\n",
   pci_mmcfg_config[i].pci_segment_group_number);
return 0;
}
-   printk(KERN_INFO "PCI: Using MMCONFIG at %x\n", pci_mmcfg_config[i].base_address);
+   printk(KERN_INFO "PCI: Using MMCONFIG at %x-%x\n", pci_mmcfg_config[i].base_address, pci_mmcfg_config[i].base_address + size - 1);
}
 
raw_pci_ops = &pci_mmcfg;
-- 
1.4.4.1.g278f


[0/5] PCI MMConfig per-chipset support

2006-12-07 Thread Olivier Galibert

Done in 5 steps, at Andi's very reasonable request:

1/5: PCI MMConfig: Share what's shareable.
  Share code between i386 and x86-64

2/5: PCI MMConfig: Only call unreachable_devices() when type 1 is available.
  Trivial fix.

3/5: PCI MMConfig: Only map what's necessary.
  Trivial fix too.

4/5: PCI MMConfig: Detect and support the E7520 and the 945G/GZ/P/PL
  The actual per-chipset support.

5/5: PCI MMConfig: Reserve resources but only when we're sure about them.
  Add the resources in /proc/iomem when the chipset is known.

  OG.


Re: [PATCH] PCI MMConfig: Detect and support the E7520 and the 945G/GZ/P/PL

2006-11-30 Thread Olivier Galibert

On Mon, Nov 27, 2006 at 09:24:06PM +0100, Olivier Galibert wrote:
> On Mon, Nov 27, 2006 at 08:07:48PM +0100, Andi Kleen wrote:
> > Is that with just the code movement patch or your feature patch
> > added too? If the later can you test it with only code movement
> > (and compare against vanilla kernel). at least code movement
> > only should behave exactly the same as unpatched kernel.
> 
> You misread.  Unpatched kernel does not work.  That's why I gave the
> git reference of the kernel too.  Patched kernel does not work either,
> unsurprisingly (bios gives correct tables on that box).

Ok, I'm trying to debug it, and it's a pain.  It's a timing issue,
mmcfg write accesses are too slow for something.  The get_base_addr()
call is enough to slow things down too much, which explains why the
fundamentally simpler x86-64 code works without a hitch.

Finding out what it is too slow for, though, is an interesting
proposition.  It's not entirely obvious it is actually related to the
sata accesses.

  OG.


Re: [PATCH] PCI MMConfig: Detect and support the E7520 and the 945G/GZ/P/PL

2006-11-27 Thread Olivier Galibert

On Mon, Nov 27, 2006 at 08:07:48PM +0100, Andi Kleen wrote:
> Is that with just the code movement patch or your feature patch
> added too? If the later can you test it with only code movement
> (and compare against vanilla kernel). at least code movement
> only should behave exactly the same as unpatched kernel.

You misread.  Unpatched kernel does not work.  That's why I gave the
git reference of the kernel too.  Patched kernel does not work either,
unsurprisingly (bios gives correct tables on that box).

  OG.
