Re: [page-reclaim] Augmented Page Reclaim

2021-02-09 Thread Jesse Barnes
> ==
> Augmented Page Reclaim
> ==
> We would like to share a work with you and see if there is enough
> interest to warrant a run for the mainline. This work is a part of
> result from a decade of research and experimentation in memory
> overcommit at Google: an augmented page reclaim that, in our
> experience, is performant, versatile and, more importantly, simple.

Per discussion on IRC, maybe some additional background would help.

In looking at browser workloads on Chrome OS, we found that reclaim was:
1) too expensive in terms of CPU usage
2) often making poor decisions about what to reclaim

This work was mainly targeted toward improving those things, with an
eye toward interactive performance for browser workloads.

We have a few key tests we use for that, that measure tab switch times
and number of tab discards when under memory pressure, and this
approach significantly improves these (see Yu's data).

We do expect this approach will also be beneficial to cloud workloads,
and so are looking for people to try it out in their environments with
their favorite key tests or workloads.

Thanks,
Jesse


Re: [PATCH] x86/gpu: add JSL stolen memory support

2020-11-19 Thread Jesse Barnes
On Thu, Nov 19, 2020 at 11:19 AM Bjorn Helgaas  wrote:
>
> [+cc Jesse]
>
> On Thu, Nov 19, 2020 at 10:37:10AM +0100, Daniel Vetter wrote:
> > On Thu, Nov 19, 2020 at 12:14 AM Bjorn Helgaas  wrote:
> > > On Wed, Nov 18, 2020 at 10:57:26PM +0100, Daniel Vetter wrote:
> > > > On Wed, Nov 18, 2020 at 5:02 PM Bjorn Helgaas  
> > > > wrote:
> > > > > On Fri, Nov 06, 2020 at 10:39:16AM +0100, Daniel Vetter wrote:
> > > > > > On Thu, Nov 5, 2020 at 3:17 PM Bjorn Helgaas  
> > > > > > wrote:
> > > > > > > On Thu, Nov 05, 2020 at 11:46:06AM +0200, Joonas Lahtinen wrote:
> > > > > > > > Quoting Bjorn Helgaas (2020-11-04 19:35:56)
> > > > > > > > > [+cc Jani, Joonas, Rodrigo, David, Daniel]
> > > > > > > > >
> > > > > > > > > On Wed, Nov 04, 2020 at 05:35:06PM +0530, Tejas Upadhyay 
> > > > > > > > > wrote:
> > > > > > > > > > JSL re-uses the same stolen memory as ICL and EHL.
> > > > > > > > > >
> > > > > > > > > > Cc: Lucas De Marchi 
> > > > > > > > > > Cc: Matt Roper 
> > > > > > > > > > Signed-off-by: Tejas Upadhyay 
> > > > > > > > > > 
> > > > > > > > >
> > > > > > > > > I don't plan to do anything with this since previous similar 
> > > > > > > > > patches
> > > > > > > > > have gone through some other tree, so this is just kibitzing.
> > > > > > > > >
> > > > > > > > > But the fact that we have this long list of Intel devices [1] 
> > > > > > > > > that
> > > > > > > > > constantly needs updates [2] is a hint that something is 
> > > > > > > > > wrong.
> > > > > > > >
> > > > > > > > We add an entry for every new integrated graphics platform. 
> > > > > > > > Once the
> > > > > > > > platform is added, there have not been changes lately.
> > > > > > > >
> > > > > > > > > IIUC the general idea is that we need to discover Intel gfx 
> > > > > > > > > memory by
> > > > > > > > > looking at device-dependent config space and add it to the 
> > > > > > > > > E820 map.
> > > > > > > > > Apparently the quirks discover this via PCI config registers 
> > > > > > > > > like
> > > > > > > > > I830_ESMRAMC, I845_ESMRAMC, etc, and tell the driver about it 
> > > > > > > > > via the
> > > > > > > > > global "intel_graphics_stolen_res"?
> > > > > > > >
> > > > > > > > We discover what is called the graphics data stolen memory. It 
> > > > > > > > is regular
> > > > > > > > system memory range that is not CPU accessible. It is 
> > > > > > > > accessible by the
> > > > > > > > integrated graphics only.
> > > > > > > >
> > > > > > > > See: 
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/kernel/early-quirks.c?h=v5.10-rc2=814c5f1f52a4beb3710317022acd6ad34fc0b6b9
> > > > > > > >
> > > > > > > > > That's not the way this should work.  There should some 
> > > > > > > > > generic, non
> > > > > > > > > device-dependent PCI or ACPI method to discover the memory 
> > > > > > > > > used, or at
> > > > > > > > > least some way to do it in the driver instead of early arch 
> > > > > > > > > code.
> > > > > > > >
> > > > > > > > It's used by the early BIOS/UEFI code to set up initial 
> > > > > > > > framebuffer.
> > > > > > > > Even if i915 driver is never loaded, the memory ranges still 
> > > > > > > > need to
> > > > > > > > be fixed. They source of the problem is that the OEM BIOS which 
> > > > > > > > are
> > > > > > > > not under our control get the programming wrong.
> > > > > > > >
> > > > > > > > We used to detect the memory region size again at i915 
> > > > > > > > initialization
> > > > > > > > but wanted to eliminate the code duplication and resulting 
> > > > > > > > subtle bugs
> > > > > > > > that caused. Conclusion back then was that storing the struct 
> > > > > > > > resource
> > > > > > > > in memory is the best trade-off.
> > > > > > > >
> > > > > > > > > How is this *supposed* to work?  Is there something we can do 
> > > > > > > > > in E820
> > > > > > > > > or other resource management that would make this easier?
> > > > > > > >
> > > > > > > > The code was added around Haswell (HSW) device generation to 
> > > > > > > > mitigate
> > > > > > > > bugs in BIOS. It is traditionally hard to get all OEMs to fix 
> > > > > > > > their
> > > > > > > > BIOS when things work for Windows. It's only later years when 
> > > > > > > > some
> > > > > > > > laptop models are intended to be sold with Linux.
> > > > > > > >
> > > > > > > > The alternative would be to get all the OEM to fix their BIOS 
> > > > > > > > for Linux,
> > > > > > > > but that is not very realistic given past experiences. So it 
> > > > > > > > seems
> > > > > > > > a better choice to to add new line per platform generation to 
> > > > > > > > make
> > > > > > > > sure the users can boot to Linux.
> > > > > > >
> > > > > > > How does Windows do this?  Do they have to add similar code for 
> > > > > > > each
> > > > > > > new platform?
> > > > > >
> > > > > > Windows is chicken and doesn't move any mmio bar around on its own.
> > > > > > Except if the bios explicitly told it somehow (e.g. for the 

Re: [PATCH] x86/pci: fix intel_mid_pci.c build error when ACPI is not enabled

2020-08-20 Thread Jesse Barnes
On Wed, Aug 19, 2020 at 9:08 PM Randy Dunlap  wrote:
>
> On 8/13/20 1:55 PM, Andy Shevchenko wrote:
> > On Thu, Aug 13, 2020 at 11:31 PM Arjan van de Ven  
> > wrote:
> >> On 8/13/2020 12:58 PM, Randy Dunlap wrote:
> >>> From: Randy Dunlap 
> >>>
> >>> Fix build error when CONFIG_ACPI is not set/enabled by adding
> >>> the header file  which contains a stub for the function
> >>> in the build error.
> >>>
> >>> ../arch/x86/pci/intel_mid_pci.c: In function ‘intel_mid_pci_init’:
> >>> ../arch/x86/pci/intel_mid_pci.c:303:2: error: implicit declaration of 
> >>> function ‘acpi_noirq_set’; did you mean ‘acpi_irq_get’? 
> >>> [-Werror=implicit-function-declaration]
> >>>acpi_noirq_set();
> >
> > Reviewed-by: Andy Shevchenko 
> > Thanks!
>
> also:
> Reviewed-by: Jesse Barnes 
>
> >
> >>> Signed-off-by: Randy Dunlap 
> >>> Cc: Jacob Pan 
> >>> Cc: Len Brown 
> >>> Cc: Bjorn Helgaas 
> >>> Cc: Jesse Barnes 
> >>> Cc: Arjan van de Ven 
> >>> Cc: linux-...@vger.kernel.org
> >>> ---
> >>> Found in linux-next, but applies to/exists in mainline also.
> >>>
> >>> Alternative.1: X86_INTEL_MID depends on ACPI
> >>> Alternative.2: drop X86_INTEL_MID support
> >>
> >> at this point I'd suggest Alternative 2; the products that needed that 
> >> (past tense, that technology
> >> is no longer need for any newer products) never shipped in any form where 
> >> a 4.x or 5.x kernel could
> >> work, and they are also all locked down...
> >
> > This is not true. We have Intel Edison which runs nicely on vanilla
> > (not everything, some is still requiring a couple of patches, but most
> > of it works out-of-the-box).
> >
> > And for the record, I have been working on removing quite a pile of
> > code (~13kLOCs to the date IIRC) in MID area. Just need some time to
> > fix Edison watchdog for that.
>
>
> I didn't see a consensus on this patch, although Andy says it's still needed,
> so it shouldn't be removed (yet). Maybe his big removal patch can remove it
> later. For now can we just fix the build error?


Yeah I think it makes sense to land it.  Doesn't get in the way of a
future removal and fixes a build error in the meantime.

Jesse


Re: [PATCH] x86/pci: fix intel_mid_pci.c build error when ACPI is not enabled

2020-08-13 Thread Jesse Barnes
On Thu, Aug 13, 2020 at 12:58 PM Randy Dunlap  wrote:
>
> From: Randy Dunlap 
>
> Fix build error when CONFIG_ACPI is not set/enabled by adding
> the header file  which contains a stub for the function
> in the build error.
>
> ../arch/x86/pci/intel_mid_pci.c: In function ‘intel_mid_pci_init’:
> ../arch/x86/pci/intel_mid_pci.c:303:2: error: implicit declaration of 
> function ‘acpi_noirq_set’; did you mean ‘acpi_irq_get’? 
> [-Werror=implicit-function-declaration]
>   acpi_noirq_set();
>
> Signed-off-by: Randy Dunlap 
> Cc: Jacob Pan 
> Cc: Len Brown 
> Cc: Bjorn Helgaas 
> Cc: Jesse Barnes 
> Cc: Arjan van de Ven 
> Cc: linux-...@vger.kernel.org
> ---
> Found in linux-next, but applies to/exists in mainline also.
>
> Alternative.1: X86_INTEL_MID depends on ACPI
> Alternative.2: drop X86_INTEL_MID support
>
>  arch/x86/pci/intel_mid_pci.c |1 +
>  1 file changed, 1 insertion(+)
>
> --- linux-next-20200813.orig/arch/x86/pci/intel_mid_pci.c
> +++ linux-next-20200813/arch/x86/pci/intel_mid_pci.c
> @@ -33,6 +33,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #define PCIE_CAP_OFFSET0x100

Reviewed-by: Jesse Barnes 

Thanks Randy. Another option is to remove the MID support entirely.

Jesse


Re: [RFC] Restrict the untrusted devices, to bind to only a set of "whitelisted" drivers

2020-06-08 Thread Jesse Barnes
> > I think your suggestion to disable driver binding once the initial
> > bus/slot devices have been bound will probably work for this
> > situation.  I just wanted to be clear that without some auditing,
> > fuzzing, and additional testing, we simply have to assume that drivers
> > are *not* secure and avoid using them on untrusted devices until we're
> > fairly confident they can handle them (whether just misbehaving or
> > malicious), in combination with other approaches like IOMMUs of
> > course.  And this isn't because we don't trust driver authors or
> > kernel developers to dtrt, it's just that for many devices (maybe USB
> > is an exception) I think driver authors haven't had to consider this
> > case much, and so I think it's prudent to expect bugs in this area
> > that we need to find & fix.
>
> For USB, yes, we have now had to deal with "untrusted devices" lieing
> about their ids and sending us horrible data.  That's all due to the
> fuzzing tools that have been written over the past few years, and now we
> have some of those in the kernel tree itself to help with that testing.
>
> For PCI, heh, good luck, those assumptions about "devices sending valid
> data" are everywhere, if our experience with USB is any indication.
>
> But, to take USB as an example, this is exactly what the USB
> "authorized" flag is there for, it's a "trust" setting that userspace
> has control over.  This came from the wireless USB spec, where it was
> determined that you could not trust devices.  So just use that same
> model here, move it to the driver core for all busses to use and you
> should be fine.
>
> If that doesn't meet your needs, please let me know the specifics of
> why, with patches :)

Yeah will do for sure.  I don't want to carry a big infra for this on our own!

> Now, as to you all getting some sort of "Hardware flag" to determine
> "inside" vs. "outside" devices, hah, good luck!  It took us a long time
> to get that for USB, and even then, BIOSes lie and get it wrong all the
> time.  So you will have to also deal with that in some way, for your
> userspace policy.

I think that's inherently platform specific to some extent.  We can do
it with our coreboot based firmware, but there's no guarantee other
vendors will adopt the same approach.  But I think at least for the
ChromeOS ecosystem we can come up with something that'll work, and
allow us to dtrt in userspace wrt driver binding.

Thanks,
Jesse


Re: [RFC] Restrict the untrusted devices, to bind to only a set of "whitelisted" drivers

2020-06-08 Thread Jesse Barnes
> > I feel a lot of resistance to the proposal, however, I'm not hearing
> > any realistic solutions that may help us to move forward. We want to
> > go with a solution that is acceptable upstream as that is our mission,
> > and also helps the community, however the behemoth task of "inspect
> > all drivers and fix them" before launching a product is really an
> > unfair ask I feel :-(. Can you help us by suggesting a proposal that
> > does not require us to trust a driver equally for internal / external
> > devices?
>
> I have no idea why you feel you have to "inspect all drivers" other than
> the fact that for some reason _you_ do not feel they are secure today.
>
> What type of "assurance" are you, or anyone else going to be able to
> provide for any kernel driver that would meet such a "I feel good now"
> level?  Have you done that work for any specific driver already so that
> you can show us what you mean by this effort?  Perhaps it's as simple as
> "oh look, this tool over here runs 'clean' on the source code, all is
> good!", or not, I really have no idea.

I think there's a disconnect somewhere in this discussion... maybe
we're just approaching this with different assumptions?

I think you recognize the potential for driver vulnerabilities when
binding to new or potentially hostile devices that may be spoofing
DID/VID/class/etc than then go on to abuse driver trust or the driver
using unvalidated inputs from the device to crash or run arbitrary
code on the target system.

Yes such drivers should be fixed, no doubt.  But without lots of
fuzzing (we're working on this) and testing we'd like to avoid
exposing that attack surface at all.

I think your suggestion to disable driver binding once the initial
bus/slot devices have been bound will probably work for this
situation.  I just wanted to be clear that without some auditing,
fuzzing, and additional testing, we simply have to assume that drivers
are *not* secure and avoid using them on untrusted devices until we're
fairly confident they can handle them (whether just misbehaving or
malicious), in combination with other approaches like IOMMUs of
course.  And this isn't because we don't trust driver authors or
kernel developers to dtrt, it's just that for many devices (maybe USB
is an exception) I think driver authors haven't had to consider this
case much, and so I think it's prudent to expect bugs in this area
that we need to find & fix.

Thanks,
Jesse


Re: [PATCH RFC] sched: Add a per-thread core scheduling interface

2020-05-21 Thread Jesse Barnes
On Thu, May 21, 2020 at 1:45 PM Joel Fernandes  wrote:
>
> Hi Linus,
>
> On Thu, May 21, 2020 at 11:31:38AM -0700, Linus Torvalds wrote:
> > On Wed, May 20, 2020 at 3:26 PM Joel Fernandes (Google)
> >  wrote:
> > Generally throughput benchmarks are much easier to do, how do you do
> > this latency benchmark, and is it perhaps something that could be run
> > more widely (ie I'm thinking that if it's generic enough and stable
> > enough to be run by some of the performance regression checking
> > robots, it would be a much more interesting test-case than some of the
> > ones they run right now...)
>
> Glad you like it! The metric is calculated with a timestamp of when the
> driver says the key was pressed, up until when the GPU says we've drawn
> pixels in response.
>
> The test requires a mostly only requires Chrome browser. It opens some
> pre-existing test URLs (a google doc, a window that opens a camera stream and
> another window that decodes video). This metric is already calculated in
> Chrome, we just scrape it from
> chrome://histograms/Event.Latency.EndToEnd.KeyPress.  If you install Chrome,
> you can goto this link and see the histogram.  We open a Google docs window
> and synthetically input keys into it with a camera stream and video decoding
> running in other windows which gives the CPUs a good beating. Then we collect
> roughly the 90th percentile keypress latency from the above histogram and the
> camera and decoded video's FPS, among other things. There is a test in the
> works that my colleagues are writing to run the full Google hangout video
> chatting stack to stress the system more (versus just the camera stream).  I
> guess if the robots can somehow input keys into the Google docs and open the
> right windows, then it is just a matter of scraping the histogram.

Expanding on this a little, we're working on a couple of projects that
should provide results like these for upstream.  One is continuously
rebasing our upstream backlog onto new kernels for testing purposes
(the idea here is to make it easier for us to update kernels on
Chromebooks), and the second is to drive more stuff into the
kernelci.org infrastructure.  Given the test environments we have in
place now, we can probably get results from our continuous rebase
project first and provide those against -rc releases if that's
something you'd be interested in.  Going forward, I hope we can
extract several of our tests and put them into kernelci as well, so we
get more general coverage without the potential impact of our (still
somewhat large) upstream backlog of patches.

To Joel's point, there are a few changes we'll have to make to get
similar results outside of our environment, but I think that's doable
without a ton of work.  And if anyone is curious, I think most of this
stuff is already public in the tast and autotest repos of the
chromiumos tree.  Just let us know if you want to make changes or port
to another environment so we can try to stay in sync wrt new features,
etc.

Thanks,
Jesse


Re: [PATCH] kernel/resource.c: fix muxed resource handling in __request_region()

2016-02-23 Thread Jesse Barnes
On 02/23/2016 08:19 AM, Simon Guinot wrote:
> On Mon, Feb 22, 2016 at 12:46:09PM -0800, Jesse Barnes wrote:
>> On 02/22/2016 05:49 AM, Alan Cox wrote:
>>>> we have some good alternatives in the form of bus and platform
>>>> drivers that
>>>> can manage the appropriate serialization and keep things from
>>>> stomping
>>>> on one another.
>>>
>>> It's not used much, especially nowdays. The use case is basically multi
>>> I/O chips on the ISA/LPC bus with magic shared config register ports.
>>>
>>> We have sufficiently few of those we could give muxed the boot and
>>> special case them if preferred.
>>
>> Ah that's right, now I remember the context.  So where should we go from 
>> here then?  Just leave the ugly fix in or hack on old stuff and hope not to 
>> break it?
> 
> Hi Jesse,
> 
> The fix is not ugly but only incomplete. And I have to say that the
> whole IORESOURCE_MUXED thing is not shiny either :)

Yeah definitely, I'm not casting stones at you, this whole problem is an
ugly one. :)

As Alan said, muxed is really intended for a pretty limited set of
cases.  The "right" solution is a lot more work than its worth, so we
have this instead, which is fine.  But obviously it's been a little
trickier to put in place than we hoped. :)

> The main problem in __request_region() is that we are dropping the
> spinlock of the resource list while holding a reference on a resource,
> waiting for a muxed resource to become available.
> 
> From here, I can see two bugs:
> 
> 1 - At wake-up, the next __request_resource() iteration will not detect
> a remaining conflict. To work properly, __request_resource() needs
> to be called with a parent of the conflicting resource. Instead we
> are passing the conflicting resource itself...
> 2 - At wake-up the resource pointer we are holding could have been
> freed. Since we are just about to use this pointer to insert a new
> resource in the linked list, it is broken...
> 
> My patch fixes 1 and makes things better for 2.
> 
> But I think Linus is right. If at wake-up we use the original parent
> resource to check again for a conflict, then we are safe.
> 
> If you want, I can propose a such patch.
> 
> Note that IORESOURCE_MUXED is mostly used by Super-I/Os drivers. A
> Super-I/O is a legacy I/O controller embedded on x86 motherboards. It is
> used to connect the low-bandwidth devices such as parallel ports, serial
> ports, keyboard controllers, hardware monitoring controllers, GPIO
> controllers, etc. While each logical device have its own memory region,
> a shared memory region is used for some configuration purpose.
> IORESOURCE_MUXED is a convenient way to deal with that. For some code
> examples you can look at the superio_* functions in the IT87 drivers:
> gpio/gpio-it87.c, hwmon/it87.c and watchdog/it87_wdt.c.
> 
> I am not aware of any other users for IORESOURCE_MUXED.
> 
> Let me know how you want to go and if you need my help.

I'd be happy if you proposed a patch.  It would also be nice if we could
somehow more formally limit the MUXED usage to these super I/O devices,
just so other users don't get into trouble thinking it's more general
than it is.

Thanks,
Jesse


Re: [PATCH] kernel/resource.c: fix muxed resource handling in __request_region()

2016-02-23 Thread Jesse Barnes
On 02/23/2016 08:19 AM, Simon Guinot wrote:
> On Mon, Feb 22, 2016 at 12:46:09PM -0800, Jesse Barnes wrote:
>> On 02/22/2016 05:49 AM, Alan Cox wrote:
>>>> we have some good alternatives in the form of bus and platform
>>>> drivers that
>>>> can manage the appropriate serialization and keep things from
>>>> stomping
>>>> on one another.
>>>
>>> It's not used much, especially nowdays. The use case is basically multi
>>> I/O chips on the ISA/LPC bus with magic shared config register ports.
>>>
>>> We have sufficiently few of those we could give muxed the boot and
>>> special case them if preferred.
>>
>> Ah that's right, now I remember the context.  So where should we go from 
>> here then?  Just leave the ugly fix in or hack on old stuff and hope not to 
>> break it?
> 
> Hi Jesse,
> 
> The fix is not ugly but only incomplete. And I have to say that the
> whole IORESOURCE_MUXED thing is not shiny either :)

Yeah definitely, I'm not casting stones at you, this whole problem is an
ugly one. :)

As Alan said, muxed is really intended for a pretty limited set of
cases.  The "right" solution is a lot more work than its worth, so we
have this instead, which is fine.  But obviously it's been a little
trickier to put in place than we hoped. :)

> The main problem in __request_region() is that we are dropping the
> spinlock of the resource list while holding a reference on a resource,
> waiting for a muxed resource to become available.
> 
> From here, I can see two bugs:
> 
> 1 - At wake-up, the next __request_resource() iteration will not detect
> a remaining conflict. To work properly, __request_resource() needs
> to be called with a parent of the conflicting resource. Instead we
> are passing the conflicting resource itself...
> 2 - At wake-up the resource pointer we are holding could have been
> freed. Since we are just about to use this pointer to insert a new
> resource in the linked list, it is broken...
> 
> My patch fixes 1 and makes things better for 2.
> 
> But I think Linus is right. If at wake-up we use the original parent
> resource to check again for a conflict, then we are safe.
> 
> If you want, I can propose a such patch.
> 
> Note that IORESOURCE_MUXED is mostly used by Super-I/Os drivers. A
> Super-I/O is a legacy I/O controller embedded on x86 motherboards. It is
> used to connect the low-bandwidth devices such as parallel ports, serial
> ports, keyboard controllers, hardware monitoring controllers, GPIO
> controllers, etc. While each logical device have its own memory region,
> a shared memory region is used for some configuration purpose.
> IORESOURCE_MUXED is a convenient way to deal with that. For some code
> examples you can look at the superio_* functions in the IT87 drivers:
> gpio/gpio-it87.c, hwmon/it87.c and watchdog/it87_wdt.c.
> 
> I am not aware of any other users for IORESOURCE_MUXED.
> 
> Let me know how you want to go and if you need my help.

I'd be happy if you proposed a patch.  It would also be nice if we could
somehow more formally limit the MUXED usage to these super I/O devices,
just so other users don't get into trouble thinking it's more general
than it is.

Thanks,
Jesse


Re: [PATCH] kernel/resource.c: fix muxed resource handling in __request_region()

2016-02-22 Thread Jesse Barnes
On 02/22/2016 05:49 AM, Alan Cox wrote:
>> we have some good alternatives in the form of bus and platform
>> drivers that
>> can manage the appropriate serialization and keep things from
>> stomping
>> on one another.
> 
> It's not used much, especially nowdays. The use case is basically multi
> I/O chips on the ISA/LPC bus with magic shared config register ports.
> 
> We have sufficiently few of those we could give muxed the boot and
> special case them if preferred.

Ah that's right, now I remember the context.  So where should we go from here 
then?  Just leave the ugly fix in or hack on old stuff and hope not to break it?

Jesse



Re: [PATCH] kernel/resource.c: fix muxed resource handling in __request_region()

2016-02-22 Thread Jesse Barnes
On 02/22/2016 05:49 AM, Alan Cox wrote:
>> we have some good alternatives in the form of bus and platform
>> drivers that
>> can manage the appropriate serialization and keep things from
>> stomping
>> on one another.
> 
> It's not used much, especially nowdays. The use case is basically multi
> I/O chips on the ISA/LPC bus with magic shared config register ports.
> 
> We have sufficiently few of those we could give muxed the boot and
> special case them if preferred.

Ah that's right, now I remember the context.  So where should we go from here 
then?  Just leave the ugly fix in or hack on old stuff and hope not to break it?

Jesse



Re: [PATCH] kernel/resource.c: fix muxed resource handling in __request_region()

2016-02-20 Thread Jesse Barnes
On February 20, 2016 9:12:01 AM Linus Torvalds 
<torva...@linux-foundation.org> wrote:



On Fri, Feb 19, 2016 at 3:25 PM, Jesse Barnes <jbar...@virtuousgeek.org> wrote:

+Linus (the de-facto resource guy).

On 02/19/2016 01:10 PM, Vincent Pelletier wrote:

Tested-by: Vincent Pelletier <plr.vinc...@gmail.com>


Hmm.

So I'm not entirely happy with the patch, because I think the problem
with using a possibly free'd parent resource at restart still exists.

As far as I can tell, if we hit the IORESOURCE_MUXED case *after* we
have successfully delved into a resource that wasn't busy, we will
have updated "parent" in a previous iteration of the loop, and we'll
not use the original parent when we then re-start after the sleep. So
quite frankly, I suspect any user of MUXED memory regions is still
fundamentally buggy, and IORESOURCE_MUXED has always been a hacky and
broken thing.

That said, I ended up applying the patch anyway, even if I despise it.
For all I know, muxed users never end up having those non-busy
sub-resources in practice, and maybe there is some serialization at
the top level for the drivers that use it. So if testing has shown
that it helps some actual case, I'll believe the testing. But the code
still looks rather debatable, and the whole IORESOURCE_MUXED approach
looks broken.

Jesse, that came in through you and the drm tree, I think. Alan is
marked as author, and there are other people who actually use and can
test the code. Can you guys think about the code a bit more.

I'm wondering if the *real* fix ends up being to reset the 'parent'
pointer to the original top-level parent after the sleep?

To recap: the patch is in my tree, but I'm not all that happy about it.


Thanks, yeah i think testing wins in this case.  I'll revisit the muxed 
stuff; I do

remember being dubious at the time, but iirc Alan needed it for something,
and others had been pushing for these sorts of usages for awhile even though
we have some good alternatives in the form of bus and platform drivers that
can manage the appropriate serialization and keep things from stomping
on one another.

(And sorry if this message comes across in some bullshit format, I'm trying
out a new ChromeOS based mail client for fun here.)

Thanks,
Jesse




Re: [PATCH] kernel/resource.c: fix muxed resource handling in __request_region()

2016-02-20 Thread Jesse Barnes
On February 20, 2016 9:12:01 AM Linus Torvalds 
 wrote:



On Fri, Feb 19, 2016 at 3:25 PM, Jesse Barnes  wrote:

+Linus (the de-facto resource guy).

On 02/19/2016 01:10 PM, Vincent Pelletier wrote:

Tested-by: Vincent Pelletier 


Hmm.

So I'm not entirely happy with the patch, because I think the problem
with using a possibly free'd parent resource at restart still exists.

As far as I can tell, if we hit the IORESOURCE_MUXED case *after* we
have successfully delved into a resource that wasn't busy, we will
have updated "parent" in a previous iteration of the loop, and we'll
not use the original parent when we then re-start after the sleep. So
quite frankly, I suspect any user of MUXED memory regions is still
fundamentally buggy, and IORESOURCE_MUXED has always been a hacky and
broken thing.

That said, I ended up applying the patch anyway, even if I despise it.
For all I know, muxed users never end up having those non-busy
sub-resources in practice, and maybe there is some serialization at
the top level for the drivers that use it. So if testing has shown
that it helps some actual case, I'll believe the testing. But the code
still looks rather debatable, and the whole IORESOURCE_MUXED approach
looks broken.

Jesse, that came in through you and the drm tree, I think. Alan is
marked as author, and there are other people who actually use and can
test the code. Can you guys think about the code a bit more.

I'm wondering if the *real* fix ends up being to reset the 'parent'
pointer to the original top-level parent after the sleep?

To recap: the patch is in my tree, but I'm not all that happy about it.


Thanks, yeah i think testing wins in this case.  I'll revisit the muxed 
stuff; I do

remember being dubious at the time, but iirc Alan needed it for something,
and others had been pushing for these sorts of usages for awhile even though
we have some good alternatives in the form of bus and platform drivers that
can manage the appropriate serialization and keep things from stomping
on one another.

(And sorry if this message comes across in some bullshit format, I'm trying
out a new ChromeOS based mail client for fun here.)

Thanks,
Jesse




Re: [PATCH] kernel/resource.c: fix muxed resource handling in __request_region()

2016-02-19 Thread Jesse Barnes
+Linus (the de-facto resource guy).

On 02/19/2016 01:10 PM, Vincent Pelletier wrote:
> Hello,
> 
> I finally got around to rebasing some patches, and realised that the
> patch from Simon Guinot below still gets rebased over torvalds' v4.4 .
> 
> Any reason it was not applied ?
> Or was the issue fixed in another, non-git-conflicting way ? (I see
> nothing recent in git log kernel/resource.c)
> 
> I do not find a trace of a mail confirming that I tested it and that it
> fixes the issue. So here goes:
> Tested-by: Vincent Pelletier 
> 
> Testing details: bug reproduced on 4.1, patch applied over 4.1 and bug
> disappeared. After rebasing this patch (along with others) over 4.4,
> bug does not reappear. I did not try to reproduce bug with 4.4, but if
> preferred I can give it a go.
> 
> On Thu, 10 Sep 2015 00:15:18 +0200, Simon Guinot
>  wrote:
>> In __request_region, if a conflict with a BUSY and MUXED resource is
>> detected, then the caller goes to sleep and waits for the resource to
>> be released. A pointer on the conflicting resource is kept. At wake-up
>> this pointer is used as a parent to retry to request the region. A first
>> problem is that this pointer might well be invalid (if for example the
>> conflicting resource have already been freed). An another problem is
>> that the next call to __request_region() fails to detect a remaining
>> conflict. The previously conflicting resource is passed as a parameter
>> and __request_region() will look for a conflict among the children of
>> this resource and not at the resource itself. It is likely to succeed
>> anyway, even if there is still a conflict. Instead, the parent of the
>> conflicting resource should be passed to __request_region().
>>
>> As a fix attempt, this patch don't update the parent resource pointer in
>> the case we have to wait for a muxed region right after.
>>
>> Reported-by: Vincent Pelletier 
>> Signed-off-by: Simon Guinot 
>> Tested-by: Vincent Donnefort 
>> ---
>>  kernel/resource.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index fed052a1bc9f..b8c84804db6a 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -1072,9 +1072,10 @@ struct resource * __request_region(struct resource 
>> *parent,
>>  if (!conflict)
>>  break;
>>  if (conflict != parent) {
>> -parent = conflict;
>> -if (!(conflict->flags & IORESOURCE_BUSY))
>> +if (!(conflict->flags & IORESOURCE_BUSY)) {
>> +parent = conflict;
>>  continue;
>> +}
>>  }
>>  if (conflict->flags & flags & IORESOURCE_MUXED) {
>>  add_wait_queue(_resource_wait, );
> 
> Regards,
> 



Re: [PATCH] kernel/resource.c: fix muxed resource handling in __request_region()

2016-02-19 Thread Jesse Barnes
+Linus (the de-facto resource guy).

On 02/19/2016 01:10 PM, Vincent Pelletier wrote:
> Hello,
> 
> I finally got around to rebasing some patches, and realised that the
> patch from Simon Guinot below still gets rebased over torvalds' v4.4 .
> 
> Any reason it was not applied ?
> Or was the issue fixed in another, non-git-conflicting way ? (I see
> nothing recent in git log kernel/resource.c)
> 
> I do not find a trace of a mail confirming that I tested it and that it
> fixes the issue. So here goes:
> Tested-by: Vincent Pelletier 
> 
> Testing details: bug reproduced on 4.1, patch applied over 4.1 and bug
> disappeared. After rebasing this patch (along with others) over 4.4,
> bug does not reappear. I did not try to reproduce bug with 4.4, but if
> preferred I can give it a go.
> 
> On Thu, 10 Sep 2015 00:15:18 +0200, Simon Guinot
>  wrote:
>> In __request_region, if a conflict with a BUSY and MUXED resource is
>> detected, then the caller goes to sleep and waits for the resource to
>> be released. A pointer on the conflicting resource is kept. At wake-up
>> this pointer is used as a parent to retry to request the region. A first
>> problem is that this pointer might well be invalid (if for example the
>> conflicting resource have already been freed). An another problem is
>> that the next call to __request_region() fails to detect a remaining
>> conflict. The previously conflicting resource is passed as a parameter
>> and __request_region() will look for a conflict among the children of
>> this resource and not at the resource itself. It is likely to succeed
>> anyway, even if there is still a conflict. Instead, the parent of the
>> conflicting resource should be passed to __request_region().
>>
>> As a fix attempt, this patch don't update the parent resource pointer in
>> the case we have to wait for a muxed region right after.
>>
>> Reported-by: Vincent Pelletier 
>> Signed-off-by: Simon Guinot 
>> Tested-by: Vincent Donnefort 
>> ---
>>  kernel/resource.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index fed052a1bc9f..b8c84804db6a 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -1072,9 +1072,10 @@ struct resource * __request_region(struct resource 
>> *parent,
>>  if (!conflict)
>>  break;
>>  if (conflict != parent) {
>> -parent = conflict;
>> -if (!(conflict->flags & IORESOURCE_BUSY))
>> +if (!(conflict->flags & IORESOURCE_BUSY)) {
>> +parent = conflict;
>>  continue;
>> +}
>>  }
>>  if (conflict->flags & flags & IORESOURCE_MUXED) {
>>  add_wait_queue(_resource_wait, );
> 
> Regards,
> 



Re: [PATCH 0/4 v2] Implement access checks in iommu page fault paths

2015-11-18 Thread Jesse Barnes
On 11/17/2015 07:11 AM, Joerg Roedel wrote:
> Hi,
> 
> here is the second version of the patch-set to implement
> proper access checks into the io-page-fault handlers of the
> AMD IOMMU and Intel VT-d drivers.
> 
> Two additional patches clean up the AMD part a bit further.
> Since I can't test this this code myself due to lack of
> hardware or software that utilizes it, I'd appreciate some
> external testing.
> 
> It took me a while to get these out, mostly because I tried
> to setup my own HSA test environment to at least test the
> AMD changes myself. That failed, so I am sending this out
> with another request for testing.
> 
> Oded, Jesse, would you two please test these patches and
> report back? Thanks a lot!

I might be able to find time later this week or next; I have to update
my tree to David's latest bits and rebase my i915 code on top along with
these patches.  Then I can run some of my tests and expand them to try
r/w tests w/o mapping the buffer with the right access mode.

Jesse

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/4 v2] Implement access checks in iommu page fault paths

2015-11-18 Thread Jesse Barnes
On 11/17/2015 07:11 AM, Joerg Roedel wrote:
> Hi,
> 
> here is the second version of the patch-set to implement
> proper access checks into the io-page-fault handlers of the
> AMD IOMMU and Intel VT-d drivers.
> 
> Two additional patches clean up the AMD part a bit further.
> Since I can't test this this code myself due to lack of
> hardware or software that utilizes it, I'd appreciate some
> external testing.
> 
> It took me a while to get these out, mostly because I tried
> to setup my own HSA test environment to at least test the
> AMD changes myself. That failed, so I am sending this out
> with another request for testing.
> 
> Oded, Jesse, would you two please test these patches and
> report back? Thanks a lot!

I might be able to find time later this week or next; I have to update
my tree to David's latest bits and rebase my i915 code on top along with
these patches.  Then I can run some of my tests and expand them to try
r/w tests w/o mapping the buffer with the right access mode.

Jesse

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Intel-gfx] [git pull] drm for 4.3

2015-09-22 Thread Jesse Barnes
Cc'ing Maarten and Matt; I'm guessing this may be related to one of
their recent patches.

Jesse

On 09/21/2015 11:48 AM, Dave Jones wrote:
> On Mon, Sep 07, 2015 at 02:45:59PM -0400, Dave Jones wrote:
>  > On Fri, Sep 04, 2015 at 11:40:53PM +0100, Dave Airlie wrote:
>  >  > 
>  >  > Hi Linus,
>  >  > 
>  >  > This is the main pull request for the drm for 4.3. Nouveau is probably 
> the biggest
>  >  > amount of changes in here, since it missed 4.2. Highlights below, along 
> with the usual
>  >  > bunch of fixes. There are a few minor conflicts with your tree but 
> nothing 
>  >  > you can't handle. All stuff outside drm should have applicable acks.
>  >  > 
>  >  > Highlights:
>  >  > 
>  >  > ...
>  >  > i915:
>  >  > Skylake support enabled by default
>  >  > legacy modesetting using atomic infrastructure
>  >  > Skylake fixes
>  >  > GEN9 workarounds
>  > 
>  > Since this merge, I'm seeing this twice during boot..
> 
> And still there in -rc2.  Several other people reported this too,
> and they also got no reponse.
> 
> I'll start bisecting when I get home tonight. It shouldn't be too hard,
> as 4.2 was fine.
> 
>   Dave
> 
>  > [ cut here ]
>  > WARNING: CPU: 0 PID: 6 at drivers/gpu/drm/i915/intel_display.c:1377 
> assert_planes_disabled+0xdf/0x140()
>  > plane A assertion failure, should be disabled but not
>  > CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.0-think+ #9
>  > Workqueue: events_unbound async_run_entry_fn
>  >  0561 88050392b6f8 8d7dccce 88050392b740
>  >  88050392b730 8d079ee2 880500a6 
>  >    8805008e99c8 88050392b790
>  > Call Trace:
>  >  [] dump_stack+0x4e/0x79
>  >  [] warn_slowpath_common+0x82/0xc0
>  >  [] warn_slowpath_fmt+0x4c/0x50
>  >  [] assert_planes_disabled+0xdf/0x140
>  >  [] intel_disable_pipe+0x4b/0x2c0
>  >  [] haswell_crtc_disable+0x8a/0x2e0
>  >  [] intel_atomic_commit+0xff/0x1320
>  >  [] ? drm_atomic_check_only+0x21e/0x550
>  >  [] drm_atomic_commit+0x37/0x60
>  >  [] drm_atomic_helper_set_config+0x1c5/0x430
>  >  [] drm_mode_set_config_internal+0x65/0x110
>  >  [] restore_fbdev_mode+0xbe/0xe0
>  >  [] drm_fb_helper_restore_fbdev_mode_unlocked+0x25/0x70
>  >  [] drm_fb_helper_set_par+0x2d/0x50
>  >  [] intel_fbdev_set_par+0x1a/0x60
>  >  [] fbcon_init+0x545/0x5d0
>  >  [] visual_init+0xca/0x130
>  >  [] do_bind_con_driver+0x1c5/0x3b0
>  >  [] do_take_over_console+0x149/0x1a0
>  >  [] do_fbcon_takeover+0x57/0xb0
>  >  [] fbcon_event_notify+0x66c/0x760
>  >  [] notifier_call_chain+0x3e/0xb0
>  >  [] __blocking_notifier_call_chain+0x4d/0x70
>  >  [] blocking_notifier_call_chain+0x16/0x20
>  >  [] fb_notifier_call_chain+0x1b/0x20
>  >  [] register_framebuffer+0x1e7/0x300
>  >  [] drm_fb_helper_initial_config+0x252/0x3e0
>  >  [] intel_fbdev_initial_config+0x1b/0x20
>  >  [] async_run_entry_fn+0x4a/0x140
>  >  [] process_one_work+0x1fd/0x670
>  >  [] ? process_one_work+0x16c/0x670
>  >  [] worker_thread+0x4e/0x450
>  >  [] ? process_one_work+0x670/0x670
>  >  [] kthread+0x101/0x120
>  >  [] ? kthread_create_on_node+0x250/0x250
>  >  [] ret_from_fork+0x3f/0x70
>  >  [] ? kthread_create_on_node+0x250/0x250
>  > ---[ end trace 54cab2e0c772d5d9 ]---
>  > 
>  > 
>  > 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3 
> Processor Integrated Graphics Controller (rev 06)
> 
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Intel-gfx] [git pull] drm for 4.3

2015-09-22 Thread Jesse Barnes
Cc'ing Maarten and Matt; I'm guessing this may be related to one of
their recent patches.

Jesse

On 09/21/2015 11:48 AM, Dave Jones wrote:
> On Mon, Sep 07, 2015 at 02:45:59PM -0400, Dave Jones wrote:
>  > On Fri, Sep 04, 2015 at 11:40:53PM +0100, Dave Airlie wrote:
>  >  > 
>  >  > Hi Linus,
>  >  > 
>  >  > This is the main pull request for the drm for 4.3. Nouveau is probably 
> the biggest
>  >  > amount of changes in here, since it missed 4.2. Highlights below, along 
> with the usual
>  >  > bunch of fixes. There are a few minor conflicts with your tree but 
> nothing 
>  >  > you can't handle. All stuff outside drm should have applicable acks.
>  >  > 
>  >  > Highlights:
>  >  > 
>  >  > ...
>  >  > i915:
>  >  > Skylake support enabled by default
>  >  > legacy modesetting using atomic infrastructure
>  >  > Skylake fixes
>  >  > GEN9 workarounds
>  > 
>  > Since this merge, I'm seeing this twice during boot..
> 
> And still there in -rc2.  Several other people reported this too,
> and they also got no reponse.
> 
> I'll start bisecting when I get home tonight. It shouldn't be too hard,
> as 4.2 was fine.
> 
>   Dave
> 
>  > [ cut here ]
>  > WARNING: CPU: 0 PID: 6 at drivers/gpu/drm/i915/intel_display.c:1377 
> assert_planes_disabled+0xdf/0x140()
>  > plane A assertion failure, should be disabled but not
>  > CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.0-think+ #9
>  > Workqueue: events_unbound async_run_entry_fn
>  >  0561 88050392b6f8 8d7dccce 88050392b740
>  >  88050392b730 8d079ee2 880500a6 
>  >    8805008e99c8 88050392b790
>  > Call Trace:
>  >  [] dump_stack+0x4e/0x79
>  >  [] warn_slowpath_common+0x82/0xc0
>  >  [] warn_slowpath_fmt+0x4c/0x50
>  >  [] assert_planes_disabled+0xdf/0x140
>  >  [] intel_disable_pipe+0x4b/0x2c0
>  >  [] haswell_crtc_disable+0x8a/0x2e0
>  >  [] intel_atomic_commit+0xff/0x1320
>  >  [] ? drm_atomic_check_only+0x21e/0x550
>  >  [] drm_atomic_commit+0x37/0x60
>  >  [] drm_atomic_helper_set_config+0x1c5/0x430
>  >  [] drm_mode_set_config_internal+0x65/0x110
>  >  [] restore_fbdev_mode+0xbe/0xe0
>  >  [] drm_fb_helper_restore_fbdev_mode_unlocked+0x25/0x70
>  >  [] drm_fb_helper_set_par+0x2d/0x50
>  >  [] intel_fbdev_set_par+0x1a/0x60
>  >  [] fbcon_init+0x545/0x5d0
>  >  [] visual_init+0xca/0x130
>  >  [] do_bind_con_driver+0x1c5/0x3b0
>  >  [] do_take_over_console+0x149/0x1a0
>  >  [] do_fbcon_takeover+0x57/0xb0
>  >  [] fbcon_event_notify+0x66c/0x760
>  >  [] notifier_call_chain+0x3e/0xb0
>  >  [] __blocking_notifier_call_chain+0x4d/0x70
>  >  [] blocking_notifier_call_chain+0x16/0x20
>  >  [] fb_notifier_call_chain+0x1b/0x20
>  >  [] register_framebuffer+0x1e7/0x300
>  >  [] drm_fb_helper_initial_config+0x252/0x3e0
>  >  [] intel_fbdev_initial_config+0x1b/0x20
>  >  [] async_run_entry_fn+0x4a/0x140
>  >  [] process_one_work+0x1fd/0x670
>  >  [] ? process_one_work+0x16c/0x670
>  >  [] worker_thread+0x4e/0x450
>  >  [] ? process_one_work+0x670/0x670
>  >  [] kthread+0x101/0x120
>  >  [] ? kthread_create_on_node+0x250/0x250
>  >  [] ret_from_fork+0x3f/0x70
>  >  [] ? kthread_create_on_node+0x250/0x250
>  > ---[ end trace 54cab2e0c772d5d9 ]---
>  > 
>  > 
>  > 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3 
> Processor Integrated Graphics Controller (rev 06)
> 
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [PATCH v4 3/3] vt: fix console lock vs. kernfs s_active lock order

2015-03-26 Thread Jesse Barnes
On 12/16/2014 09:42 AM, Daniel Vetter wrote:
> On Tue, Dec 16, 2014 at 6:15 PM, Peter Hurley  
> wrote:
>> On 12/16/2014 11:22 AM, Imre Deak wrote:
>>> On Tue, 2014-12-16 at 10:00 -0500, Peter Hurley wrote:
 Fine. Just another expedient fix piled on top of other expedient fixes
 that go back past 3.9 with no end in sight.
>>>
>>> I'm also happy to look into narrowing down the scope of console_lock in
>>> fbdev/fbcon as was suggested. But doing that as a follow-up to this
>>> change still makes sense to me since it will take more time and have the
>>> risk of regressions that are not related to what this change fixes.
>>
>> I apologize for my tone. I'm not blaming you for the current situation,
>> nor is it your responsibility to go fix vt/fbcon/fbdev driver stack
>> inversion. I'm just trying to bring some awareness of the larger scope,
>> so that collectively we take action and resolve the underlying problems.
> 
> Yeah I guess I should tune down my NACK to a Grumpy-if-merged-by too.
> We have a lot of nonoptimal solutions at hand here :(

So where does that leave us with this fix?  Should we wait for someone
to come along and do all the rework?  Imre said he'd be willing to do
it, but still feels this fix makes sense.

Or we could just abandon the fb layer altogether (my preference).  In
that case fixing this is fine, since we'll be able to ignore it for
configs that switch over to using !fbdev and kmscon.

Thanks,
Jesse

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [PATCH v4 3/3] vt: fix console lock vs. kernfs s_active lock order

2015-03-26 Thread Jesse Barnes
On 12/16/2014 09:42 AM, Daniel Vetter wrote:
 On Tue, Dec 16, 2014 at 6:15 PM, Peter Hurley pe...@hurleysoftware.com 
 wrote:
 On 12/16/2014 11:22 AM, Imre Deak wrote:
 On Tue, 2014-12-16 at 10:00 -0500, Peter Hurley wrote:
 Fine. Just another expedient fix piled on top of other expedient fixes
 that go back past 3.9 with no end in sight.

 I'm also happy to look into narrowing down the scope of console_lock in
 fbdev/fbcon as was suggested. But doing that as a follow-up to this
 change still makes sense to me since it will take more time and have the
 risk of regressions that are not related to what this change fixes.

 I apologize for my tone. I'm not blaming you for the current situation,
 nor is it your responsibility to go fix vt/fbcon/fbdev driver stack
 inversion. I'm just trying to bring some awareness of the larger scope,
 so that collectively we take action and resolve the underlying problems.
 
 Yeah I guess I should tune down my NACK to a Grumpy-if-merged-by too.
 We have a lot of nonoptimal solutions at hand here :(

So where does that leave us with this fix?  Should we wait for someone
to come along and do all the rework?  Imre said he'd be willing to do
it, but still feels this fix makes sense.

Or we could just abandon the fb layer altogether (my preference).  In
that case fixing this is fine, since we'll be able to ignore it for
configs that switch over to using !fbdev and kmscon.

Thanks,
Jesse

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] iommu/amd: Fix amd_iommu_free_device()

2015-02-03 Thread Jesse Barnes
On Tue, 3 Feb 2015 13:25:51 +0100
Peter Zijlstra  wrote:

> On Tue, Feb 03, 2015 at 12:27:33PM +0100, Peter Zijlstra wrote:
> > drivers/iommu/amd_iommu_v2.c-static void
> > put_device_state_wait(struct device_state *dev_state)
> > drivers/iommu/amd_iommu_v2.c-{ drivers/iommu/amd_iommu_v2.c-
> > DEFINE_WAIT(wait); drivers/iommu/amd_iommu_v2.c-
> > drivers/iommu/amd_iommu_v2.c-   prepare_to_wait(_state->wq,
> > , TASK_UNINTERRUPTIBLE); drivers/iommu/amd_iommu_v2.c-   if
> > (!atomic_dec_and_test(_state->count))
> > drivers/iommu/amd_iommu_v2.c:   schedule();
> > drivers/iommu/amd_iommu_v2.c-   finish_wait(_state->wq, );
> > drivers/iommu/amd_iommu_v2.c- drivers/iommu/amd_iommu_v2.c-
> > free_device_state(dev_state); drivers/iommu/amd_iommu_v2.c-}
> > 
> > No loop...
> 
> Jesse, any objections to this?

None from me, seems reasonable.  But this is Joerg's code I think, so
he should ack.

Jesse
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] iommu/amd: Fix amd_iommu_free_device()

2015-02-03 Thread Jesse Barnes
On Tue, 3 Feb 2015 13:25:51 +0100
Peter Zijlstra pet...@infradead.org wrote:

 On Tue, Feb 03, 2015 at 12:27:33PM +0100, Peter Zijlstra wrote:
  drivers/iommu/amd_iommu_v2.c-static void
  put_device_state_wait(struct device_state *dev_state)
  drivers/iommu/amd_iommu_v2.c-{ drivers/iommu/amd_iommu_v2.c-
  DEFINE_WAIT(wait); drivers/iommu/amd_iommu_v2.c-
  drivers/iommu/amd_iommu_v2.c-   prepare_to_wait(dev_state-wq,
  wait, TASK_UNINTERRUPTIBLE); drivers/iommu/amd_iommu_v2.c-   if
  (!atomic_dec_and_test(dev_state-count))
  drivers/iommu/amd_iommu_v2.c:   schedule();
  drivers/iommu/amd_iommu_v2.c-   finish_wait(dev_state-wq, wait);
  drivers/iommu/amd_iommu_v2.c- drivers/iommu/amd_iommu_v2.c-
  free_device_state(dev_state); drivers/iommu/amd_iommu_v2.c-}
  
  No loop...
 
 Jesse, any objections to this?

None from me, seems reasonable.  But this is Joerg's code I think, so
he should ack.

Jesse
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] iommu/amd: use handle_mm_fault directly v2

2015-01-26 Thread Jesse Barnes
On Sun, 25 Jan 2015 15:16:44 +0200
Oded Gabbay  wrote:

> 
> 
> On 11/13/2014 12:10 AM, Jesse Barnes wrote:
> > This could be useful for debug in the future if we want to track
> > major/minor faults more closely, and also avoids the put_page trick we
> > used with gup.
> >
> > In order to do this, we also track the task struct in the PASID state
> > structure.  This lets us update the appropriate task stats after the
> > fault has been handled, and may aid with debug in the future as well.
> >
> > v2: drop task accounting; GPU activity may have been submitted by a
> >  different thread than the one binding the PASID (Joerg)
> >
> > Tested-by: Oded Gabbay
> > Signed-off-by: Jesse Barnes
> 
> Hi Jesse,
> 
> I know I tested your patch a few months ago, but we have a new feature (still 
> internally) in the driver, which has some conflicts with this patch.
> 
> Our feature is basically doing "exception handling" by registering a callback 
> function with the iommu driver in inv_ppr_cb.
> 
> Now, with the old code (we used 3.17.2 until a few days ago), this callback 
> function was called in, at least, three use-cases (which we are testing):
> 
> (1) Writing to a "bad" system memory address, which is *not* in the process's 
> memory address space.
> 
> (2) Writing to a read-only page, which is inside the process's memory address 
> space
> 
> (3) Reading from a page without permissions, which is inside the process's 
> memory address space
> 
> With the new code (3.19-rc5), this callback is only called in the first 
> use-case, while (2) and (3) are handled in handle_mm_fault(), which is now 
> called from do_fault. The return value of handle_mm_fault() is 0, so 
> handle_fault_error() is not called and amdkfd doesn't get notification, hence 
> our test fails.
> 
> This is a problem for us as we want to propagate these exceptions to the user 
> space HSA runtime, so it could handle them.
> 
> I have 2 questions:
> 
> 1. Why don't we call inv_ppr_cb() in any case ?

We do if we fail to allocate the vma or it's in the wrong location, but
we could extend the do_fault() handling to do it in more cases.

> 2. How come handle_mm_fault() returns 0 in cases (2) and (3) ? Or in other 
> words, what is considered to be a success in handle_mm_fault() and is it 
> visible 
> to the user-space process ?

handle_mm_fault() is somewhat of a low level function.  We can catch
more cases in our own do_fault() code if we need to.   The x86
__do_page_fault is probably a good reference.  I mainly tried to match
existing behavior when I added the handle_mm_fault(), but may have
missed stuff.  As I said, we can extend our do_fault() to handle all
the cases we want prior to calling handle_mm_fault().

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] iommu/amd: use handle_mm_fault directly v2

2015-01-26 Thread Jesse Barnes
On Sun, 25 Jan 2015 15:16:44 +0200
Oded Gabbay oded.gab...@amd.com wrote:

 
 
 On 11/13/2014 12:10 AM, Jesse Barnes wrote:
  This could be useful for debug in the future if we want to track
  major/minor faults more closely, and also avoids the put_page trick we
  used with gup.
 
  In order to do this, we also track the task struct in the PASID state
  structure.  This lets us update the appropriate task stats after the
  fault has been handled, and may aid with debug in the future as well.
 
  v2: drop task accounting; GPU activity may have been submitted by a
   different thread than the one binding the PASID (Joerg)
 
  Tested-by: Oded Gabbayoded.gab...@amd.com
  Signed-off-by: Jesse Barnesjbar...@virtuousgeek.org
 
 Hi Jesse,
 
 I know I tested your patch a few months ago, but we have a new feature (still 
 internally) in the driver, which has some conflicts with this patch.
 
 Our feature is basically doing exception handling by registering a callback 
 function with the iommu driver in inv_ppr_cb.
 
 Now, with the old code (we used 3.17.2 until a few days ago), this callback 
 function was called in, at least, three use-cases (which we are testing):
 
 (1) Writing to a bad system memory address, which is *not* in the process's 
 memory address space.
 
 (2) Writing to a read-only page, which is inside the process's memory address 
 space
 
 (3) Reading from a page without permissions, which is inside the process's 
 memory address space
 
 With the new code (3.19-rc5), this callback is only called in the first 
 use-case, while (2) and (3) are handled in handle_mm_fault(), which is now 
 called from do_fault. The return value of handle_mm_fault() is 0, so 
 handle_fault_error() is not called and amdkfd doesn't get notification, hence 
 our test fails.
 
 This is a problem for us as we want to propagate these exceptions to the user 
 space HSA runtime, so it could handle them.
 
 I have 2 questions:
 
 1. Why don't we call inv_ppr_cb() in any case ?

We do if we fail to allocate the vma or it's in the wrong location, but
we could extend the do_fault() handling to do it in more cases.

 2. How come handle_mm_fault() returns 0 in cases (2) and (3) ? Or in other 
 words, what is considered to be a success in handle_mm_fault() and is it 
 visible 
 to the user-space process ?

handle_mm_fault() is somewhat of a low level function.  We can catch
more cases in our own do_fault() code if we need to.   The x86
__do_page_fault is probably a good reference.  I mainly tried to match
existing behavior when I added the handle_mm_fault(), but may have
missed stuff.  As I said, we can extend our do_fault() to handle all
the cases we want prior to calling handle_mm_fault().

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Special handling of display/VGA devices in hotplug drivers

2014-12-11 Thread Jesse Barnes
On Thu, 11 Dec 2014 13:11:36 -0500
Greg Kroah-Hartman  wrote:

> On Thu, Dec 11, 2014 at 10:34:30AM -0700, Bjorn Helgaas wrote:
> > It looks like you added the initial pciehp driver [1], which
> > includes the following code in pciehp_disable_slot():
> > 
> > + if (class_code == PCI_BASE_CLASS_DISPLAY) {
> > + /* Display/Video adapter (not supported) */
> > + rc = REMOVE_NOT_SUPPORTED;
> > 
> > + /* If it's a bridge, check the VGA Enable bit */
> > + if ((header_type & 0x7F) == PCI_HEADER_TYPE_BRIDGE) {
> > + rc = pci_bus_read_config_byte (pci_bus, devfn,
> > PCI_BRIDGE_CONTROL, );
> > + if (rc)
> > + return rc;
> > +
> > + /* If the VGA Enable bit is set, remove isn't supported */
> > + if (BCR & PCI_BRIDGE_CTL_VGA) {
> > + rc = REMOVE_NOT_SUPPORTED;
> > 
> > I'm trying to figure out why VGA devices are handled specially.  I
> > can't find anything in the PCI specs that mentions this.  Most of
> > the other PCI hotplug drivers have similar code.  Do you remember
> > anything about this?
> 
> The PCI spec said that you were not allowed to hotplug VGA drivers.
> The big issue is that POST usually needs to run on those things, and
> there is no way to POST a PCI hotplugged device.
> 
> Does the spec not say that anymore?  I haven't looked in years at
> it...
> 
> Do you want to hot-add a VGA device?  Are these lines causing a
> problem with something?

Yeah, the legacy I/O regions get routed through the bridge with the VGA
bit set, and most legacy code probably can't handle that (whether POST,
VBIOS, or VGA drivers).

There is some code for moving the VGA routing around, so that might be
an option if you wanted to remove such a bridge.  You'd have to find a
VGA device under another bridge, and enable routing to that first, then
you could do the remove.

Jesse
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Special handling of display/VGA devices in hotplug drivers

2014-12-11 Thread Jesse Barnes
On Thu, 11 Dec 2014 13:11:36 -0500
Greg Kroah-Hartman gre...@linuxfoundation.org wrote:

 On Thu, Dec 11, 2014 at 10:34:30AM -0700, Bjorn Helgaas wrote:
  It looks like you added the initial pciehp driver [1], which
  includes the following code in pciehp_disable_slot():
  
  + if (class_code == PCI_BASE_CLASS_DISPLAY) {
  + /* Display/Video adapter (not supported) */
  + rc = REMOVE_NOT_SUPPORTED;
  
  + /* If it's a bridge, check the VGA Enable bit */
  + if ((header_type  0x7F) == PCI_HEADER_TYPE_BRIDGE) {
  + rc = pci_bus_read_config_byte (pci_bus, devfn,
  PCI_BRIDGE_CONTROL, BCR);
  + if (rc)
  + return rc;
  +
  + /* If the VGA Enable bit is set, remove isn't supported */
  + if (BCR  PCI_BRIDGE_CTL_VGA) {
  + rc = REMOVE_NOT_SUPPORTED;
  
  I'm trying to figure out why VGA devices are handled specially.  I
  can't find anything in the PCI specs that mentions this.  Most of
  the other PCI hotplug drivers have similar code.  Do you remember
  anything about this?
 
 The PCI spec said that you were not allowed to hotplug VGA drivers.
 The big issue is that POST usually needs to run on those things, and
 there is no way to POST a PCI hotplugged device.
 
 Does the spec not say that anymore?  I haven't looked in years at
 it...
 
 Do you want to hot-add a VGA device?  Are these lines causing a
 problem with something?

Yeah, the legacy I/O regions get routed through the bridge with the VGA
bit set, and most legacy code probably can't handle that (whether POST,
VBIOS, or VGA drivers).

There is some code for moving the VGA routing around, so that might be
an option if you wanted to remove such a bridge.  You'd have to find a
VGA device under another bridge, and enable routing to that first, then
you could do the remove.

Jesse
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] iommu/amd: use handle_mm_fault directly v2

2014-11-12 Thread Jesse Barnes
This could be useful for debug in the future if we want to track
major/minor faults more closely, and also avoids the put_page trick we
used with gup.

In order to do this, we also track the task struct in the PASID state
structure.  This lets us update the appropriate task stats after the
fault has been handled, and may aid with debug in the future as well.

v2: drop task accounting; GPU activity may have been submitted by a
different thread than the one binding the PASID (Joerg)

Tested-by: Oded Gabbay 
Signed-off-by: Jesse Barnes 
---
 drivers/iommu/amd_iommu_v2.c | 84 
 1 file changed, 53 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 90d734b..9c0d6e2 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -513,45 +513,67 @@ static void finish_pri_tag(struct device_state *dev_state,
spin_unlock_irqrestore(_state->lock, flags);
 }
 
+static void handle_fault_error(struct fault *fault)
+{
+   int status;
+
+   if (!fault->dev_state->inv_ppr_cb) {
+   set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
+   return;
+   }
+
+   status = fault->dev_state->inv_ppr_cb(fault->dev_state->pdev,
+ fault->pasid,
+ fault->address,
+ fault->flags);
+   switch (status) {
+   case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
+   set_pri_tag_status(fault->state, fault->tag, PPR_SUCCESS);
+   break;
+   case AMD_IOMMU_INV_PRI_RSP_INVALID:
+   set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
+   break;
+   case AMD_IOMMU_INV_PRI_RSP_FAIL:
+   set_pri_tag_status(fault->state, fault->tag, PPR_FAILURE);
+   break;
+   default:
+   BUG();
+   }
+}
+
 static void do_fault(struct work_struct *work)
 {
struct fault *fault = container_of(work, struct fault, work);
-   int npages, write;
-   struct page *page;
+   struct mm_struct *mm;
+   struct vm_area_struct *vma;
+   u64 address;
+   int ret, write;
 
write = !!(fault->flags & PPR_FAULT_WRITE);
 
-   down_read(>state->mm->mmap_sem);
-   npages = get_user_pages(NULL, fault->state->mm,
-   fault->address, 1, write, 0, , NULL);
-   up_read(>state->mm->mmap_sem);
-
-   if (npages == 1) {
-   put_page(page);
-   } else if (fault->dev_state->inv_ppr_cb) {
-   int status;
-
-   status = fault->dev_state->inv_ppr_cb(fault->dev_state->pdev,
- fault->pasid,
- fault->address,
- fault->flags);
-   switch (status) {
-   case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
-   set_pri_tag_status(fault->state, fault->tag, 
PPR_SUCCESS);
-   break;
-   case AMD_IOMMU_INV_PRI_RSP_INVALID:
-   set_pri_tag_status(fault->state, fault->tag, 
PPR_INVALID);
-   break;
-   case AMD_IOMMU_INV_PRI_RSP_FAIL:
-   set_pri_tag_status(fault->state, fault->tag, 
PPR_FAILURE);
-   break;
-   default:
-   BUG();
-   }
-   } else {
-   set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
+   mm = fault->state->mm;
+   address = fault->address;
+
+   down_read(>mmap_sem);
+   vma = find_extend_vma(mm, address);
+   if (!vma || address < vma->vm_start) {
+   /* failed to get a vma in the right range */
+   up_read(>mmap_sem);
+   handle_fault_error(fault);
+   goto out;
+   }
+
+   ret = handle_mm_fault(mm, vma, address, write);
+   if (ret & VM_FAULT_ERROR) {
+   /* failed to service fault */
+   up_read(>mmap_sem);
+   handle_fault_error(fault);
+   goto out;
}
 
+   up_read(>mmap_sem);
+
+out:
finish_pri_tag(fault->dev_state, fault->state, fault->tag);
 
put_pasid_state(fault->state);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-11-12 Thread Jesse Barnes
On Thu, 6 Nov 2014 14:01:22 +0100
Joerg Roedel  wrote:

> On Wed, Nov 05, 2014 at 01:51:09PM -0800, Jesse Barnes wrote:
> > On Wed, 5 Nov 2014 13:03:51 +0100
> > Joerg Roedel  wrote:
> > > Thanks for testing Oded. Jesse, the patch looks good to me, except the
> > > task accounting for the page-faults. I'd like to get rid of using
> > > task_struct in the IOMMUv2 driver entirely if possible. Also it is not
> > > really the CPU task causing the faults, but some non-CPU process.
> > 
> > Hm, but the CPU task initiates the activity on the GPU, so we should
> > account for it somewhere, right?  I guess I had been thinking of the
> > "task" as spanning the CPUs and GPUs and other devices in the system,
> > rather than just representing the CPU activity.
> 
> One problem is that the task that called amd_iommu_bind_pasid() isn't
> necessarily the same task (thread) that queues the jobs to the device.
> The thread that called amd_iommu_bind_pasid() might even exit while
> other threads still use the mappings.
> 
> Besides that, from an abstract point of view, what is running on the
> device (GPU) is a logically seperate 'thread' of the process which we
> should account for seperatly. If we want accounting for these off-CPU
> threads we should probably introduce some concept of a non-CPU task to
> the kernel and do the accounting there?

Yeah those are good points; I hadn't been thinking of multi-threaded
stuff.  Logically the GPU stuff really is a separate thread in that
sense, so monitoring faults separately makes sense.

I wonder if we need a new "device_task_struct" or
"coprocessor_task_struct" or something to wrap some simple stuff on
non-CPU devices.  They could be sub-classed by drivers that needed
additional driver specific data.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-11-12 Thread Jesse Barnes
This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
simply, rather than doing tricks with page refs and get_user_pages().

Signed-off-by: Jesse Barnes 
---
 mm/memory.c | 1 +
 mm/mmap.c   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 1cc6bfb..969ff0c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct 
vm_area_struct *vma,
 
return ret;
 }
+EXPORT_SYMBOL_GPL(handle_mm_fault);
 
 #ifndef __PAGETABLE_PUD_FOLDED
 /*
diff --git a/mm/mmap.c b/mm/mmap.c
index 7f85520..2ee7971 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
 }
 #endif
 
+EXPORT_SYMBOL_GPL(find_extend_vma);
+
 /*
  * Ok - we have the memory areas we should free on the vma list,
  * so release them, and do the vma updates.
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-11-12 Thread Jesse Barnes
This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
simply, rather than doing tricks with page refs and get_user_pages().

Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 mm/memory.c | 1 +
 mm/mmap.c   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 1cc6bfb..969ff0c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct 
vm_area_struct *vma,
 
return ret;
 }
+EXPORT_SYMBOL_GPL(handle_mm_fault);
 
 #ifndef __PAGETABLE_PUD_FOLDED
 /*
diff --git a/mm/mmap.c b/mm/mmap.c
index 7f85520..2ee7971 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
 }
 #endif
 
+EXPORT_SYMBOL_GPL(find_extend_vma);
+
 /*
  * Ok - we have the memory areas we should free on the vma list,
  * so release them, and do the vma updates.
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-11-12 Thread Jesse Barnes
On Thu, 6 Nov 2014 14:01:22 +0100
Joerg Roedel jroe...@suse.de wrote:

 On Wed, Nov 05, 2014 at 01:51:09PM -0800, Jesse Barnes wrote:
  On Wed, 5 Nov 2014 13:03:51 +0100
  Joerg Roedel jroe...@suse.de wrote:
   Thanks for testing Oded. Jesse, the patch looks good to me, except the
   task accounting for the page-faults. I'd like to get rid of using
   task_struct in the IOMMUv2 driver entirely if possible. Also it is not
   really the CPU task causing the faults, but some non-CPU process.
  
  Hm, but the CPU task initiates the activity on the GPU, so we should
  account for it somewhere, right?  I guess I had been thinking of the
  task as spanning the CPUs and GPUs and other devices in the system,
  rather than just representing the CPU activity.
 
 One problem is that the task that called amd_iommu_bind_pasid() isn't
 necessarily the same task (thread) that queues the jobs to the device.
 The thread that called amd_iommu_bind_pasid() might even exit while
 other threads still use the mappings.
 
 Besides that, from an abstract point of view, what is running on the
 device (GPU) is a logically seperate 'thread' of the process which we
 should account for seperatly. If we want accounting for these off-CPU
 threads we should probably introduce some concept of a non-CPU task to
 the kernel and do the accounting there?

Yeah those are good points; I hadn't been thinking of multi-threaded
stuff.  Logically the GPU stuff really is a separate thread in that
sense, so monitoring faults separately makes sense.

I wonder if we need a new device_task_struct or
coprocessor_task_struct or something to wrap some simple stuff on
non-CPU devices.  They could be sub-classed by drivers that needed
additional driver specific data.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] iommu/amd: use handle_mm_fault directly v2

2014-11-12 Thread Jesse Barnes
This could be useful for debug in the future if we want to track
major/minor faults more closely, and also avoids the put_page trick we
used with gup.

In order to do this, we also track the task struct in the PASID state
structure.  This lets us update the appropriate task stats after the
fault has been handled, and may aid with debug in the future as well.

v2: drop task accounting; GPU activity may have been submitted by a
different thread than the one binding the PASID (Joerg)

Tested-by: Oded Gabbay oded.gab...@amd.com
Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 drivers/iommu/amd_iommu_v2.c | 84 
 1 file changed, 53 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 90d734b..9c0d6e2 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -513,45 +513,67 @@ static void finish_pri_tag(struct device_state *dev_state,
spin_unlock_irqrestore(pasid_state-lock, flags);
 }
 
+static void handle_fault_error(struct fault *fault)
+{
+   int status;
+
+   if (!fault-dev_state-inv_ppr_cb) {
+   set_pri_tag_status(fault-state, fault-tag, PPR_INVALID);
+   return;
+   }
+
+   status = fault-dev_state-inv_ppr_cb(fault-dev_state-pdev,
+ fault-pasid,
+ fault-address,
+ fault-flags);
+   switch (status) {
+   case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
+   set_pri_tag_status(fault-state, fault-tag, PPR_SUCCESS);
+   break;
+   case AMD_IOMMU_INV_PRI_RSP_INVALID:
+   set_pri_tag_status(fault-state, fault-tag, PPR_INVALID);
+   break;
+   case AMD_IOMMU_INV_PRI_RSP_FAIL:
+   set_pri_tag_status(fault-state, fault-tag, PPR_FAILURE);
+   break;
+   default:
+   BUG();
+   }
+}
+
 static void do_fault(struct work_struct *work)
 {
struct fault *fault = container_of(work, struct fault, work);
-   int npages, write;
-   struct page *page;
+   struct mm_struct *mm;
+   struct vm_area_struct *vma;
+   u64 address;
+   int ret, write;
 
write = !!(fault-flags  PPR_FAULT_WRITE);
 
-   down_read(fault-state-mm-mmap_sem);
-   npages = get_user_pages(NULL, fault-state-mm,
-   fault-address, 1, write, 0, page, NULL);
-   up_read(fault-state-mm-mmap_sem);
-
-   if (npages == 1) {
-   put_page(page);
-   } else if (fault-dev_state-inv_ppr_cb) {
-   int status;
-
-   status = fault-dev_state-inv_ppr_cb(fault-dev_state-pdev,
- fault-pasid,
- fault-address,
- fault-flags);
-   switch (status) {
-   case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
-   set_pri_tag_status(fault-state, fault-tag, 
PPR_SUCCESS);
-   break;
-   case AMD_IOMMU_INV_PRI_RSP_INVALID:
-   set_pri_tag_status(fault-state, fault-tag, 
PPR_INVALID);
-   break;
-   case AMD_IOMMU_INV_PRI_RSP_FAIL:
-   set_pri_tag_status(fault-state, fault-tag, 
PPR_FAILURE);
-   break;
-   default:
-   BUG();
-   }
-   } else {
-   set_pri_tag_status(fault-state, fault-tag, PPR_INVALID);
+   mm = fault-state-mm;
+   address = fault-address;
+
+   down_read(mm-mmap_sem);
+   vma = find_extend_vma(mm, address);
+   if (!vma || address  vma-vm_start) {
+   /* failed to get a vma in the right range */
+   up_read(mm-mmap_sem);
+   handle_fault_error(fault);
+   goto out;
+   }
+
+   ret = handle_mm_fault(mm, vma, address, write);
+   if (ret  VM_FAULT_ERROR) {
+   /* failed to service fault */
+   up_read(mm-mmap_sem);
+   handle_fault_error(fault);
+   goto out;
}
 
+   up_read(mm-mmap_sem);
+
+out:
finish_pri_tag(fault-dev_state, fault-state, fault-tag);
 
put_pasid_state(fault-state);
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-11-05 Thread Jesse Barnes
On Wed, 5 Nov 2014 13:03:51 +0100
Joerg Roedel  wrote:

> Hi Oded, Jesse,
> 
> On Wed, Oct 29, 2014 at 11:33:38AM +0200, Oded Gabbay wrote:
> > I tested our amdkfd driver with your patches applied (kernel 3.17.1).
> > I run OpenCL tests, Aparapi/Sumatra (Java) and OpenMP
> > 
> > All tests passed and I didn't see any kernel error messages.
> > 
> > So:
> > 
> > Tested-by: Oded Gabbay 
> 
> Thanks for testing Oded. Jesse, the patch looks good to me, except the
> task accounting for the page-faults. I'd like to get rid of using
> task_struct in the IOMMUv2 driver entirely if possible. Also it is not
> really the CPU task causing the faults, but some non-CPU process.

Hm, but the CPU task initiates the activity on the GPU, so we should
account for it somewhere, right?  I guess I had been thinking of the
"task" as spanning the CPUs and GPUs and other devices in the system,
rather than just representing the CPU activity.

> So can you please remove that code and resend the patches with Oded's
> Tested-by and Andrew Morton on Cc? I think these patches should go
> through the -mm tree.

Sure, thanks.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-11-05 Thread Jesse Barnes
On Wed, 5 Nov 2014 13:03:51 +0100
Joerg Roedel jroe...@suse.de wrote:

 Hi Oded, Jesse,
 
 On Wed, Oct 29, 2014 at 11:33:38AM +0200, Oded Gabbay wrote:
  I tested our amdkfd driver with your patches applied (kernel 3.17.1).
  I run OpenCL tests, Aparapi/Sumatra (Java) and OpenMP
  
  All tests passed and I didn't see any kernel error messages.
  
  So:
  
  Tested-by: Oded Gabbay oded.gab...@amd.com
 
 Thanks for testing Oded. Jesse, the patch looks good to me, except the
 task accounting for the page-faults. I'd like to get rid of using
 task_struct in the IOMMUv2 driver entirely if possible. Also it is not
 really the CPU task causing the faults, but some non-CPU process.

Hm, but the CPU task initiates the activity on the GPU, so we should
account for it somewhere, right?  I guess I had been thinking of the
task as spanning the CPUs and GPUs and other devices in the system,
rather than just representing the CPU activity.

 So can you please remove that code and resend the patches with Oded's
 Tested-by and Andrew Morton on Cc? I think these patches should go
 through the -mm tree.

Sure, thanks.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-10-29 Thread Jesse Barnes
Cool, thanks a lot, Oded.  I guess my compiler did a good job making
sure there were no bugs. :)

Jesse

On Wed, 29 Oct 2014 11:33:38 +0200
Oded Gabbay  wrote:

> Hi Joerg and Jesse,
> 
> I tested our amdkfd driver with your patches applied (kernel 3.17.1).
> I run OpenCL tests, Aparapi/Sumatra (Java) and OpenMP
> 
> All tests passed and I didn't see any kernel error messages.
> 
> So:
> 
> Tested-by: Oded Gabbay 
> 
> Oded
> 
> On 10/27/2014 05:35 PM, Jesse Barnes wrote:
> 
> Thanks, I have no way of testing this, but I'm hopeful. :)
> 
> Jesse
> 
> On Mon, 27 Oct 2014 17:15:45 +0200
> Oded Gabbay  wrote:
> 
> > Sure, no problem
> >
> >   Oded
> >
> > On 10/27/2014 05:13 PM, Joerg Roedel wrote:
> >
> > Hi Oded,
> >
> > can you please test these patches with the KFD driver and make sure
> > nothing breaks for you? I really like this improvement and it would be
> > great to send it upstream for v3.19.
> >
> > Thanks,
> >
> > Joerg
> >
> > On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:
> >
> >> This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
> >> simply, rather than doing tricks with page refs and get_user_pages().
> >>
> >> Signed-off-by: Jesse Barnes 
> >> ---
> >>mm/memory.c | 1 +
> >>mm/mmap.c   | 2 ++
> >>2 files changed, 3 insertions(+)
> >>
> >> diff --git a/mm/memory.c b/mm/memory.c
> >> index 1cc6bfb..969ff0c 100644
> >> --- a/mm/memory.c
> >> +++ b/mm/memory.c
> >> @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct 
> >> vm_area_struct *vma,
> >>
> >>return ret;
> >>}
> >> +EXPORT_SYMBOL_GPL(handle_mm_fault);
> >>
> >>#ifndef __PAGETABLE_PUD_FOLDED
> >>/*
> >> diff --git a/mm/mmap.c b/mm/mmap.c
> >> index 7f85520..2ee7971 100644
> >> --- a/mm/mmap.c
> >> +++ b/mm/mmap.c
> >> @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long 
> >> addr)
> >>}
> >>#endif
> >>
> >> +EXPORT_SYMBOL_GPL(find_extend_vma);
> >> +
> >>/*
> >> * Ok - we have the memory areas we should free on the vma list,
> >> * so release them, and do the vma updates.
> >> -- 
> >> 1.9.1
> 
> 
> 


-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-10-29 Thread Jesse Barnes
Cool, thanks a lot, Oded.  I guess my compiler did a good job making
sure there were no bugs. :)

Jesse

On Wed, 29 Oct 2014 11:33:38 +0200
Oded Gabbay oded.gab...@amd.com wrote:

 Hi Joerg and Jesse,
 
 I tested our amdkfd driver with your patches applied (kernel 3.17.1).
 I run OpenCL tests, Aparapi/Sumatra (Java) and OpenMP
 
 All tests passed and I didn't see any kernel error messages.
 
 So:
 
 Tested-by: Oded Gabbay oded.gab...@amd.com
 
 Oded
 
 On 10/27/2014 05:35 PM, Jesse Barnes wrote:
 
 Thanks, I have no way of testing this, but I'm hopeful. :)
 
 Jesse
 
 On Mon, 27 Oct 2014 17:15:45 +0200
 Oded Gabbay oded.gab...@amd.com wrote:
 
  Sure, no problem
 
Oded
 
  On 10/27/2014 05:13 PM, Joerg Roedel wrote:
 
  Hi Oded,
 
  can you please test these patches with the KFD driver and make sure
  nothing breaks for you? I really like this improvement and it would be
  great to send it upstream for v3.19.
 
  Thanks,
 
  Joerg
 
  On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:
 
  This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
  simply, rather than doing tricks with page refs and get_user_pages().
 
  Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
  ---
 mm/memory.c | 1 +
 mm/mmap.c   | 2 ++
 2 files changed, 3 insertions(+)
 
  diff --git a/mm/memory.c b/mm/memory.c
  index 1cc6bfb..969ff0c 100644
  --- a/mm/memory.c
  +++ b/mm/memory.c
  @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct 
  vm_area_struct *vma,
 
 return ret;
 }
  +EXPORT_SYMBOL_GPL(handle_mm_fault);
 
 #ifndef __PAGETABLE_PUD_FOLDED
 /*
  diff --git a/mm/mmap.c b/mm/mmap.c
  index 7f85520..2ee7971 100644
  --- a/mm/mmap.c
  +++ b/mm/mmap.c
  @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long 
  addr)
 }
 #endif
 
  +EXPORT_SYMBOL_GPL(find_extend_vma);
  +
 /*
  * Ok - we have the memory areas we should free on the vma list,
  * so release them, and do the vma updates.
  -- 
  1.9.1
 
 
 


-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-10-27 Thread Jesse Barnes
Thanks, I have no way of testing this, but I'm hopeful. :)

Jesse

On Mon, 27 Oct 2014 17:15:45 +0200
Oded Gabbay  wrote:

> Sure, no problem
> 
>  Oded
> 
> On 10/27/2014 05:13 PM, Joerg Roedel wrote:
> 
> Hi Oded,
> 
> can you please test these patches with the KFD driver and make sure
> nothing breaks for you? I really like this improvement and it would be
> great to send it upstream for v3.19.
> 
> Thanks,
> 
>   Joerg
> 
> On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:
> 
> > This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
> > simply, rather than doing tricks with page refs and get_user_pages().
> >
> > Signed-off-by: Jesse Barnes 
> > ---
> >   mm/memory.c | 1 +
> >   mm/mmap.c   | 2 ++
> >   2 files changed, 3 insertions(+)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 1cc6bfb..969ff0c 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct 
> > vm_area_struct *vma,
> >   
> > return ret;
> >   }
> > +EXPORT_SYMBOL_GPL(handle_mm_fault);
> >   
> >   #ifndef __PAGETABLE_PUD_FOLDED
> >   /*
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 7f85520..2ee7971 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long 
> > addr)
> >   }
> >   #endif
> >   
> > +EXPORT_SYMBOL_GPL(find_extend_vma);
> > +
> >   /*
> >* Ok - we have the memory areas we should free on the vma list,
> >* so release them, and do the vma updates.
> > -- 
> > 1.9.1
> 
> 


-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-10-27 Thread Jesse Barnes
Thanks, I have no way of testing this, but I'm hopeful. :)

Jesse

On Mon, 27 Oct 2014 17:15:45 +0200
Oded Gabbay oded.gab...@amd.com wrote:

 Sure, no problem
 
  Oded
 
 On 10/27/2014 05:13 PM, Joerg Roedel wrote:
 
 Hi Oded,
 
 can you please test these patches with the KFD driver and make sure
 nothing breaks for you? I really like this improvement and it would be
 great to send it upstream for v3.19.
 
 Thanks,
 
   Joerg
 
 On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:
 
  This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
  simply, rather than doing tricks with page refs and get_user_pages().
 
  Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
  ---
mm/memory.c | 1 +
mm/mmap.c   | 2 ++
2 files changed, 3 insertions(+)
 
  diff --git a/mm/memory.c b/mm/memory.c
  index 1cc6bfb..969ff0c 100644
  --- a/mm/memory.c
  +++ b/mm/memory.c
  @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct 
  vm_area_struct *vma,

  return ret;
}
  +EXPORT_SYMBOL_GPL(handle_mm_fault);

#ifndef __PAGETABLE_PUD_FOLDED
/*
  diff --git a/mm/mmap.c b/mm/mmap.c
  index 7f85520..2ee7971 100644
  --- a/mm/mmap.c
  +++ b/mm/mmap.c
  @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long 
  addr)
}
#endif

  +EXPORT_SYMBOL_GPL(find_extend_vma);
  +
/*
 * Ok - we have the memory areas we should free on the vma list,
 * so release them, and do the vma updates.
  -- 
  1.9.1
 
 


-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] iommu/amd: use handle_mm_fault directly

2014-10-24 Thread Jesse Barnes
This could be useful for debug in the future if we want to track
major/minor faults more closely, and also avoids the put_page trick we
used with gup.

In order to do this, we also track the task struct in the PASID state
structure.  This lets us update the appropriate task stats after the
fault has been handled, and may aid with debug in the future as well.

Signed-off-by: Jesse Barnes 
---
 drivers/iommu/amd_iommu_v2.c | 93 +---
 1 file changed, 62 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 90d734b..b23481b 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -47,6 +47,7 @@ struct pasid_state {
atomic_t count; /* Reference count */
unsigned mmu_notifier_count;/* Counting nested mmu_notifier
   calls */
+   struct task_struct *task;   /* task_struct for accounting */
struct mm_struct *mm;   /* mm_struct for the faults */
struct mmu_notifier mn; /* mmu_notifier handle */
struct pri_queue pri[PRI_QUEUE_SIZE];   /* PRI tag states */
@@ -513,45 +514,74 @@ static void finish_pri_tag(struct device_state *dev_state,
spin_unlock_irqrestore(_state->lock, flags);
 }
 
+static void handle_fault_error(struct fault *fault)
+{
+   int status;
+
+   if (!fault->dev_state->inv_ppr_cb) {
+   set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
+   return;
+   }
+
+   status = fault->dev_state->inv_ppr_cb(fault->dev_state->pdev,
+ fault->pasid,
+ fault->address,
+ fault->flags);
+   switch (status) {
+   case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
+   set_pri_tag_status(fault->state, fault->tag, PPR_SUCCESS);
+   break;
+   case AMD_IOMMU_INV_PRI_RSP_INVALID:
+   set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
+   break;
+   case AMD_IOMMU_INV_PRI_RSP_FAIL:
+   set_pri_tag_status(fault->state, fault->tag, PPR_FAILURE);
+   break;
+   default:
+   BUG();
+   }
+}
+
 static void do_fault(struct work_struct *work)
 {
struct fault *fault = container_of(work, struct fault, work);
-   int npages, write;
-   struct page *page;
+   struct mm_struct *mm;
+   struct vm_area_struct *vma;
+   struct task_struct *task;
+   u64 address;
+   int ret, write;
 
write = !!(fault->flags & PPR_FAULT_WRITE);
 
-   down_read(>state->mm->mmap_sem);
-   npages = get_user_pages(NULL, fault->state->mm,
-   fault->address, 1, write, 0, , NULL);
-   up_read(>state->mm->mmap_sem);
-
-   if (npages == 1) {
-   put_page(page);
-   } else if (fault->dev_state->inv_ppr_cb) {
-   int status;
-
-   status = fault->dev_state->inv_ppr_cb(fault->dev_state->pdev,
- fault->pasid,
- fault->address,
- fault->flags);
-   switch (status) {
-   case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
-   set_pri_tag_status(fault->state, fault->tag, 
PPR_SUCCESS);
-   break;
-   case AMD_IOMMU_INV_PRI_RSP_INVALID:
-   set_pri_tag_status(fault->state, fault->tag, 
PPR_INVALID);
-   break;
-   case AMD_IOMMU_INV_PRI_RSP_FAIL:
-   set_pri_tag_status(fault->state, fault->tag, 
PPR_FAILURE);
-   break;
-   default:
-   BUG();
-   }
-   } else {
-   set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
+   task = fault->state->task;
+   mm = fault->state->mm;
+   address = fault->address;
+
+   down_read(>mmap_sem);
+   vma = find_extend_vma(mm, address);
+   if (!vma || address < vma->vm_start) {
+   /* failed to get a vma in the right range */
+   up_read(>mmap_sem);
+   handle_fault_error(fault);
+   goto out;
}
 
+   ret = handle_mm_fault(mm, vma, address, write);
+   if (ret & VM_FAULT_ERROR) {
+   /* failed to service fault */
+   up_read(>mmap_sem);
+   handle_fault_error(fault);
+   goto out;
+   }
+
+   if (ret & VM_FAULT_MAJOR)
+   task->maj_flt++;
+  

[PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-10-24 Thread Jesse Barnes
This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
simply, rather than doing tricks with page refs and get_user_pages().

Signed-off-by: Jesse Barnes 
---
 mm/memory.c | 1 +
 mm/mmap.c   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 1cc6bfb..969ff0c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct 
vm_area_struct *vma,
 
return ret;
 }
+EXPORT_SYMBOL_GPL(handle_mm_fault);
 
 #ifndef __PAGETABLE_PUD_FOLDED
 /*
diff --git a/mm/mmap.c b/mm/mmap.c
index 7f85520..2ee7971 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
 }
 #endif
 
+EXPORT_SYMBOL_GPL(find_extend_vma);
+
 /*
  * Ok - we have the memory areas we should free on the vma list,
  * so release them, and do the vma updates.
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] iommu/amd: use handle_mm_fault directly

2014-10-24 Thread Jesse Barnes
This could be useful for debug in the future if we want to track
major/minor faults more closely, and also avoids the put_page trick we
used with gup.

In order to do this, we also track the task struct in the PASID state
structure.  This lets us update the appropriate task stats after the
fault has been handled, and may aid with debug in the future as well.

Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 drivers/iommu/amd_iommu_v2.c | 93 +---
 1 file changed, 62 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 90d734b..b23481b 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -47,6 +47,7 @@ struct pasid_state {
atomic_t count; /* Reference count */
unsigned mmu_notifier_count;/* Counting nested mmu_notifier
   calls */
+   struct task_struct *task;   /* task_struct for accounting */
struct mm_struct *mm;   /* mm_struct for the faults */
struct mmu_notifier mn; /* mmu_notifier handle */
struct pri_queue pri[PRI_QUEUE_SIZE];   /* PRI tag states */
@@ -513,45 +514,74 @@ static void finish_pri_tag(struct device_state *dev_state,
spin_unlock_irqrestore(pasid_state-lock, flags);
 }
 
+static void handle_fault_error(struct fault *fault)
+{
+   int status;
+
+   if (!fault-dev_state-inv_ppr_cb) {
+   set_pri_tag_status(fault-state, fault-tag, PPR_INVALID);
+   return;
+   }
+
+   status = fault-dev_state-inv_ppr_cb(fault-dev_state-pdev,
+ fault-pasid,
+ fault-address,
+ fault-flags);
+   switch (status) {
+   case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
+   set_pri_tag_status(fault-state, fault-tag, PPR_SUCCESS);
+   break;
+   case AMD_IOMMU_INV_PRI_RSP_INVALID:
+   set_pri_tag_status(fault-state, fault-tag, PPR_INVALID);
+   break;
+   case AMD_IOMMU_INV_PRI_RSP_FAIL:
+   set_pri_tag_status(fault-state, fault-tag, PPR_FAILURE);
+   break;
+   default:
+   BUG();
+   }
+}
+
 static void do_fault(struct work_struct *work)
 {
struct fault *fault = container_of(work, struct fault, work);
-   int npages, write;
-   struct page *page;
+   struct mm_struct *mm;
+   struct vm_area_struct *vma;
+   struct task_struct *task;
+   u64 address;
+   int ret, write;
 
write = !!(fault-flags  PPR_FAULT_WRITE);
 
-   down_read(fault-state-mm-mmap_sem);
-   npages = get_user_pages(NULL, fault-state-mm,
-   fault-address, 1, write, 0, page, NULL);
-   up_read(fault-state-mm-mmap_sem);
-
-   if (npages == 1) {
-   put_page(page);
-   } else if (fault-dev_state-inv_ppr_cb) {
-   int status;
-
-   status = fault-dev_state-inv_ppr_cb(fault-dev_state-pdev,
- fault-pasid,
- fault-address,
- fault-flags);
-   switch (status) {
-   case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
-   set_pri_tag_status(fault-state, fault-tag, 
PPR_SUCCESS);
-   break;
-   case AMD_IOMMU_INV_PRI_RSP_INVALID:
-   set_pri_tag_status(fault-state, fault-tag, 
PPR_INVALID);
-   break;
-   case AMD_IOMMU_INV_PRI_RSP_FAIL:
-   set_pri_tag_status(fault-state, fault-tag, 
PPR_FAILURE);
-   break;
-   default:
-   BUG();
-   }
-   } else {
-   set_pri_tag_status(fault-state, fault-tag, PPR_INVALID);
+   task = fault-state-task;
+   mm = fault-state-mm;
+   address = fault-address;
+
+   down_read(mm-mmap_sem);
+   vma = find_extend_vma(mm, address);
+   if (!vma || address  vma-vm_start) {
+   /* failed to get a vma in the right range */
+   up_read(mm-mmap_sem);
+   handle_fault_error(fault);
+   goto out;
}
 
+   ret = handle_mm_fault(mm, vma, address, write);
+   if (ret  VM_FAULT_ERROR) {
+   /* failed to service fault */
+   up_read(mm-mmap_sem);
+   handle_fault_error(fault);
+   goto out;
+   }
+
+   if (ret  VM_FAULT_MAJOR)
+   task-maj_flt++;
+   else
+   task-min_flt++;
+
+   up_read(mm-mmap_sem);
+
+out:
finish_pri_tag(fault-dev_state, fault-state, fault-tag);
 
put_pasid_state(fault-state

[PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

2014-10-24 Thread Jesse Barnes
This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
simply, rather than doing tricks with page refs and get_user_pages().

Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 mm/memory.c | 1 +
 mm/mmap.c   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 1cc6bfb..969ff0c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct 
vm_area_struct *vma,
 
return ret;
 }
+EXPORT_SYMBOL_GPL(handle_mm_fault);
 
 #ifndef __PAGETABLE_PUD_FOLDED
 /*
diff --git a/mm/mmap.c b/mm/mmap.c
index 7f85520..2ee7971 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
 }
 #endif
 
+EXPORT_SYMBOL_GPL(find_extend_vma);
+
 /*
  * Ok - we have the memory areas we should free on the vma list,
  * so release them, and do the vma updates.
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] kbuild: use cscope -q option

2014-10-16 Thread Jesse Barnes
This makes symbol lookups significantly faster at the cost of some disk
space.

Signed-off-by: Jesse Barnes 
---
 scripts/tags.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/tags.sh b/scripts/tags.sh
index 293828b..3de7719 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -142,7 +142,7 @@ all_defconfigs()
 docscope()
 {
(echo \-k; echo \-q; all_target_sources) > cscope.files
-   cscope -b -f cscope.out
+   cscope -bq -f cscope.out
 }
 
 dogtags()
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] kbuild: use cscope -q option

2014-10-16 Thread Jesse Barnes
This makes symbol lookups significantly faster at the cost of some disk
space.

Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 scripts/tags.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/tags.sh b/scripts/tags.sh
index 293828b..3de7719 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -142,7 +142,7 @@ all_defconfigs()
 docscope()
 {
(echo \-k; echo \-q; all_target_sources)  cscope.files
-   cscope -b -f cscope.out
+   cscope -bq -f cscope.out
 }
 
 dogtags()
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] android: add sync_fence_create_dma

2014-08-15 Thread Jesse Barnes
On Fri, 15 Aug 2014 14:46:56 +0800
Greg Kroah-Hartman  wrote:

> On Thu, Aug 14, 2014 at 11:54:52AM +0200, Maarten Lankhorst wrote:
> > This allows users of dma fences to create a android fence.
> 
> Who is going to use these functions?  I need an in-kernel user before I
> can add new api calls.

I have some code that uses them, but I need a userspace user to push my
stuff. :)

I'm working on the latter bit too in Mesa, and we have demand from
Android, so we should get some users shortly.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] android: add sync_fence_create_dma

2014-08-15 Thread Jesse Barnes
On Fri, 15 Aug 2014 14:46:56 +0800
Greg Kroah-Hartman gre...@linuxfoundation.org wrote:

 On Thu, Aug 14, 2014 at 11:54:52AM +0200, Maarten Lankhorst wrote:
  This allows users of dma fences to create a android fence.
 
 Who is going to use these functions?  I need an in-kernel user before I
 can add new api calls.

I have some code that uses them, but I need a userspace user to push my
stuff. :)

I'm working on the latter bit too in Mesa, and we have demand from
Android, so we should get some users shortly.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] android: add sync_fence_create_dma

2014-08-14 Thread Jesse Barnes
On Thu, 14 Aug 2014 11:54:52 +0200
Maarten Lankhorst  wrote:

> This allows users of dma fences to create a android fence.
> 
> Signed-off-by: Maarten Lankhorst 
> Cc: Daniel Vetter 
> Cc: Jesse Barnes 
> ---
>  drivers/staging/android/sync.c | 24 
>  drivers/staging/android/sync.h | 11 +++
>  2 files changed, 31 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
> index 69139ce7420d..c9331250ac26 100644
> --- a/drivers/staging/android/sync.c
> +++ b/drivers/staging/android/sync.c
> @@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f, struct 
> fence_cb *cb)
>  }
>  
>  /* TODO: implement a create which takes more that one sync_pt */
> -struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
> +static struct sync_fence *sync_fence_create_noref(const char *name, struct 
> fence *pt)
>  {
>   struct sync_fence *fence;
>  
> @@ -199,16 +199,32 @@ struct sync_fence *sync_fence_create(const char *name, 
> struct sync_pt *pt)
>   fence->num_fences = 1;
>   atomic_set(>status, 1);
>  
> - fence->cbs[0].sync_pt = >base;
> + fence->cbs[0].sync_pt = pt;
>   fence->cbs[0].fence = fence;
> - if (fence_add_callback(>base, >cbs[0].cb,
> -fence_check_cb_func))
> + if (fence_add_callback(pt, >cbs[0].cb, fence_check_cb_func))
>   atomic_dec(>status);
>  
>   sync_fence_debug_add(fence);
>  
>   return fence;
>  }
> +
> +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt)
> +{
> + struct sync_fence *fence;

I ran into the same naming trouble in my implementation; I think I
ended up with sfence for sync fence declarations.

> +
> + fence = sync_fence_create_noref(name, fence_get(pt));
> + if (!fence)
> + fence_put(pt);
> +
> + return fence;
> +}
> +EXPORT_SYMBOL(sync_fence_create_dma);
> +
> +struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
> +{
> + return sync_fence_create_noref(name, >base);
> +}
>  EXPORT_SYMBOL(sync_fence_create);
>  
>  struct sync_fence *sync_fence_fdget(int fd)
> diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h
> index 66b0f431f63e..7b3bf560790c 100644
> --- a/drivers/staging/android/sync.h
> +++ b/drivers/staging/android/sync.h
> @@ -254,6 +254,17 @@ void sync_pt_free(struct sync_pt *pt);
>   */
>  struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt);
>  
> +/**
> + * sync_fence_create_dma() - creates a sync fence from a dma fence
> + * @name:name of fence to create
> + * @pt:  dma fence to add to the sync fence
> + *
> + * Creates a fence containg @pt.  Once this is called, the fence takes
> + * a reference on @pt, unlike sync_fence_create which doesn't add one.
> + */
> +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt);
> +
> +
>  /*
>   * API for sync_fence consumers
>   */

Yeah, I've been using this, looks good.

Reviewed-by: Jesse Barnes 

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] android: add sync_fence_create_dma

2014-08-14 Thread Jesse Barnes
On Thu, 14 Aug 2014 11:54:52 +0200
Maarten Lankhorst maarten.lankho...@canonical.com wrote:

 This allows users of dma fences to create a android fence.
 
 Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
 Cc: Daniel Vetter dan...@ffwll.ch
 Cc: Jesse Barnes jbar...@virtuousgeek.org
 ---
  drivers/staging/android/sync.c | 24 
  drivers/staging/android/sync.h | 11 +++
  2 files changed, 31 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
 index 69139ce7420d..c9331250ac26 100644
 --- a/drivers/staging/android/sync.c
 +++ b/drivers/staging/android/sync.c
 @@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f, struct 
 fence_cb *cb)
  }
  
  /* TODO: implement a create which takes more that one sync_pt */
 -struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
 +static struct sync_fence *sync_fence_create_noref(const char *name, struct 
 fence *pt)
  {
   struct sync_fence *fence;
  
 @@ -199,16 +199,32 @@ struct sync_fence *sync_fence_create(const char *name, 
 struct sync_pt *pt)
   fence-num_fences = 1;
   atomic_set(fence-status, 1);
  
 - fence-cbs[0].sync_pt = pt-base;
 + fence-cbs[0].sync_pt = pt;
   fence-cbs[0].fence = fence;
 - if (fence_add_callback(pt-base, fence-cbs[0].cb,
 -fence_check_cb_func))
 + if (fence_add_callback(pt, fence-cbs[0].cb, fence_check_cb_func))
   atomic_dec(fence-status);
  
   sync_fence_debug_add(fence);
  
   return fence;
  }
 +
 +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt)
 +{
 + struct sync_fence *fence;

I ran into the same naming trouble in my implementation; I think I
ended up with sfence for sync fence declarations.

 +
 + fence = sync_fence_create_noref(name, fence_get(pt));
 + if (!fence)
 + fence_put(pt);
 +
 + return fence;
 +}
 +EXPORT_SYMBOL(sync_fence_create_dma);
 +
 +struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
 +{
 + return sync_fence_create_noref(name, pt-base);
 +}
  EXPORT_SYMBOL(sync_fence_create);
  
  struct sync_fence *sync_fence_fdget(int fd)
 diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h
 index 66b0f431f63e..7b3bf560790c 100644
 --- a/drivers/staging/android/sync.h
 +++ b/drivers/staging/android/sync.h
 @@ -254,6 +254,17 @@ void sync_pt_free(struct sync_pt *pt);
   */
  struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt);
  
 +/**
 + * sync_fence_create_dma() - creates a sync fence from a dma fence
 + * @name:name of fence to create
 + * @pt:  dma fence to add to the sync fence
 + *
 + * Creates a fence containg @pt.  Once this is called, the fence takes
 + * a reference on @pt, unlike sync_fence_create which doesn't add one.
 + */
 +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt);
 +
 +
  /*
   * API for sync_fence consumers
   */

Yeah, I've been using this, looks good.

Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Usage of _PAGE_PCD et al in i915 driver

2014-08-13 Thread Jesse Barnes
On Fri, 8 Aug 2014 15:14:15 +0200
Daniel Vetter  wrote:

> Adding relevant mailing lists.
> 
> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross  wrote:
> > I'm just about to create a patch for full PAT support in the Linux
> > kernel, including Xen. For this purpose I introduce a translation
> > between cache modes and pte bits.
> >
> > Scanning the kernel sources for usage of the cache mode bits in the
> > pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
> > _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
> > to create ptes not for usage by the main processor, but for the
> > graphics processor. Is this true? In this case I'd suggest to define
> > i915-specific macros instead of using the x86 ones.
> 
> Yeah, those are gpu specific PAT tables, but the hw engineers
> specifically designed this to match, and we've tried to follow the cpu
> side to match it. Especially in the future that will be somewhat
> important, since we want to fully share the entire address space
> between cpu and gpu on the next platform. Jesse is working on that.

Right, we have an x86 compatible MMU in the GPU itself, so re-using the
defines makes sense.  I suppose with your work you'll move them and
make them a bit more opaque?  If so, we'll still want a way to get at
them directly, or access your mapping functions for generating PTE bits
for the GPU MMU.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Usage of _PAGE_PCD et al in i915 driver

2014-08-13 Thread Jesse Barnes
On Fri, 8 Aug 2014 15:14:15 +0200
Daniel Vetter daniel.vet...@ffwll.ch wrote:

 Adding relevant mailing lists.
 
 On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross jgr...@suse.com wrote:
  I'm just about to create a patch for full PAT support in the Linux
  kernel, including Xen. For this purpose I introduce a translation
  between cache modes and pte bits.
 
  Scanning the kernel sources for usage of the cache mode bits in the
  pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
  _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
  to create ptes not for usage by the main processor, but for the
  graphics processor. Is this true? In this case I'd suggest to define
  i915-specific macros instead of using the x86 ones.
 
 Yeah, those are gpu specific PAT tables, but the hw engineers
 specifically designed this to match, and we've tried to follow the cpu
 side to match it. Especially in the future that will be somewhat
 important, since we want to fully share the entire address space
 between cpu and gpu on the next platform. Jesse is working on that.

Right, we have an x86 compatible MMU in the GPU itself, so re-using the
defines makes sense.  I suppose with your work you'll move them and
make them a bit more opaque?  If so, we'll still want a way to get at
them directly, or access your mapping functions for generating PTE bits
for the GPU MMU.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] mmu_notifier: Add mmu_notifier_invalidate_range()

2014-07-25 Thread Jesse Barnes
On Fri, 25 Jul 2014 23:38:06 +0200
Joerg Roedel  wrote:

> On Fri, Jul 25, 2014 at 01:16:39PM -0700, Jesse Barnes wrote:
> > > To allow managing external TLBs the MMU-notifiers need to
> > > catch the moment when pages are unmapped but not yet freed.
> > > This new notifier catches that moment and notifies the
> > > interested subsytem when pages that were unmapped are about
> > > to be freed. The new notifier will only be called between
> > > invalidate_range_start()/end().
> > 
> > So if we were actually sharing page tables, we should be able to make
> > start/end no-ops and just use this new callback, assuming we didn't
> > need to do any other serialization or debug stuff, right?
> 
> Well, not completly. What you need with this patch-set is a
> invalidate_range and an invalidate_end call-back. There are call sites
> of the start/end functions where the TLB flush happens after the _end
> notifier (or at least can wait until _end is called). I did not add
> invalidate_range calls to these places (yet). But you can easily discard
> invalidate_range_start, any flush done in there is useless with shared
> page-tables.
> 
> I though about removing the need for invalidate_range_end too when
> writing the patches, and possible solutions are
> 
>   1) Add mmu_notifier_invalidate_range() to all places where
>  start/end is called too. This might add some unnecessary
>  overhead.
> 
>   2) Call the invalidate_range() call-back from the
>  mmu_notifier_invalidate_range_end too.
> 
>   3) Just let the user register the same function for
>  invalidate_range and invalidate_range_end
> 
> I though that option 1) adds overhead that is not needed (but it might
> not be too bad, the overhead is an additional iteration over the
> mmu_notifer list when there are no call-backs registered).
> 
> Option 2) might also be overhead if a user registers different functions
> for invalidate_range() and invalidate_range_end(). In the end I came to
> the conclusion that option 3) is the best one from an overhead POV.
> 
> But probably targeting better usability with one of the other options is
> a better choice? I am open for thoughts and suggestions on that.

Making the _end callback just do another TLB flush is fine too, but it
would be nice to have the consistency of (1).  I can live with either
though, as long as the callbacks are well documented.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] mmu_notifier: Add mmu_notifier_invalidate_range()

2014-07-25 Thread Jesse Barnes
On Thu, 24 Jul 2014 16:35:39 +0200
Joerg Roedel  wrote:

> From: Joerg Roedel 
> 
> This notifier closes an important gap with the current
> invalidate_range_start()/end() notifiers. The _start() part
> is called when all pages are still mapped while the _end()
> notifier is called when all pages are potentially unmapped
> and already freed.
> 
> This does not allow to manage external (non-CPU) hardware
> TLBs with MMU-notifiers because there is no way to prevent
> that hardware will establish new TLB entries between the
> calls of these two functions. But this is a requirement to
> the subsytem that implements these existing notifiers.
> 
> To allow managing external TLBs the MMU-notifiers need to
> catch the moment when pages are unmapped but not yet freed.
> This new notifier catches that moment and notifies the
> interested subsytem when pages that were unmapped are about
> to be freed. The new notifier will only be called between
> invalidate_range_start()/end().

So if we were actually sharing page tables, we should be able to make
start/end no-ops and just use this new callback, assuming we didn't
need to do any other serialization or debug stuff, right?

Seems like a good addition, and saves us a bunch of trouble...

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] mmu_notifier: Add mmu_notifier_invalidate_range()

2014-07-25 Thread Jesse Barnes
On Thu, 24 Jul 2014 16:35:39 +0200
Joerg Roedel j...@8bytes.org wrote:

 From: Joerg Roedel jroe...@suse.de
 
 This notifier closes an important gap with the current
 invalidate_range_start()/end() notifiers. The _start() part
 is called when all pages are still mapped while the _end()
 notifier is called when all pages are potentially unmapped
 and already freed.
 
 This does not allow to manage external (non-CPU) hardware
 TLBs with MMU-notifiers because there is no way to prevent
 that hardware will establish new TLB entries between the
 calls of these two functions. But this is a requirement to
 the subsytem that implements these existing notifiers.
 
 To allow managing external TLBs the MMU-notifiers need to
 catch the moment when pages are unmapped but not yet freed.
 This new notifier catches that moment and notifies the
 interested subsytem when pages that were unmapped are about
 to be freed. The new notifier will only be called between
 invalidate_range_start()/end().

So if we were actually sharing page tables, we should be able to make
start/end no-ops and just use this new callback, assuming we didn't
need to do any other serialization or debug stuff, right?

Seems like a good addition, and saves us a bunch of trouble...

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/3] mmu_notifier: Add mmu_notifier_invalidate_range()

2014-07-25 Thread Jesse Barnes
On Fri, 25 Jul 2014 23:38:06 +0200
Joerg Roedel j...@8bytes.org wrote:

 On Fri, Jul 25, 2014 at 01:16:39PM -0700, Jesse Barnes wrote:
   To allow managing external TLBs the MMU-notifiers need to
   catch the moment when pages are unmapped but not yet freed.
   This new notifier catches that moment and notifies the
   interested subsytem when pages that were unmapped are about
   to be freed. The new notifier will only be called between
   invalidate_range_start()/end().
  
  So if we were actually sharing page tables, we should be able to make
  start/end no-ops and just use this new callback, assuming we didn't
  need to do any other serialization or debug stuff, right?
 
 Well, not completly. What you need with this patch-set is a
 invalidate_range and an invalidate_end call-back. There are call sites
 of the start/end functions where the TLB flush happens after the _end
 notifier (or at least can wait until _end is called). I did not add
 invalidate_range calls to these places (yet). But you can easily discard
 invalidate_range_start, any flush done in there is useless with shared
 page-tables.
 
 I though about removing the need for invalidate_range_end too when
 writing the patches, and possible solutions are
 
   1) Add mmu_notifier_invalidate_range() to all places where
  start/end is called too. This might add some unnecessary
  overhead.
 
   2) Call the invalidate_range() call-back from the
  mmu_notifier_invalidate_range_end too.
 
   3) Just let the user register the same function for
  invalidate_range and invalidate_range_end
 
 I though that option 1) adds overhead that is not needed (but it might
 not be too bad, the overhead is an additional iteration over the
 mmu_notifer list when there are no call-backs registered).
 
 Option 2) might also be overhead if a user registers different functions
 for invalidate_range() and invalidate_range_end(). In the end I came to
 the conclusion that option 3) is the best one from an overhead POV.
 
 But probably targeting better usability with one of the other options is
 a better choice? I am open for thoughts and suggestions on that.

Making the _end callback just do another TLB flush is fine too, but it
would be nice to have the consistency of (1).  I can live with either
though, as long as the callbacks are well documented.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Nouveau] [PATCH 09/17] drm/radeon: use common fence implementation for fences

2014-07-23 Thread Jesse Barnes
On Wed, 23 Jul 2014 11:47:15 +0200
Daniel Vetter  wrote:

> On Tue, Jul 22, 2014 at 9:14 PM, Jesse Barnes  
> wrote:
> >> We don't have the code yet ready, but that's the direction i915 will
> >> move towards for the near future. Jesse is working on some patches
> >> already.
> >
> > Yeah I'd like to get some feedback from Maarten on my bits so I can get
> > them ready for upstream.  I still need to add documentation and tests,
> > but I'd like to make sure the interfaces and internals get acked first.
> 
> Review works better if you supply a pointer to the patches ;-) I asked
> Maarten whether he looked at it and he said he didn't know where ...

Oh I provided it in IRC earlier, figured Maarten was just busy. :)

Tree is android-dma-buf-i915-fences in my fdo linux repo.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Nouveau] [PATCH 09/17] drm/radeon: use common fence implementation for fences

2014-07-23 Thread Jesse Barnes
On Wed, 23 Jul 2014 11:47:15 +0200
Daniel Vetter dan...@ffwll.ch wrote:

 On Tue, Jul 22, 2014 at 9:14 PM, Jesse Barnes jbar...@virtuousgeek.org 
 wrote:
  We don't have the code yet ready, but that's the direction i915 will
  move towards for the near future. Jesse is working on some patches
  already.
 
  Yeah I'd like to get some feedback from Maarten on my bits so I can get
  them ready for upstream.  I still need to add documentation and tests,
  but I'd like to make sure the interfaces and internals get acked first.
 
 Review works better if you supply a pointer to the patches ;-) I asked
 Maarten whether he looked at it and he said he didn't know where ...

Oh I provided it in IRC earlier, figured Maarten was just busy. :)

Tree is android-dma-buf-i915-fences in my fdo linux repo.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 09/17] drm/radeon: use common fence implementation for fences

2014-07-22 Thread Jesse Barnes
On Tue, 22 Jul 2014 17:48:18 +0200
Daniel Vetter  wrote:

> On Tue, Jul 22, 2014 at 5:42 PM, Alex Deucher  wrote:
> >> Fence-based syncing between userspace queues submitted stuff through
> >> doorbells and anything submitted by the general simply wont work.
> >> Which is why I think the doorbell is a stupid interface since I just
> >> don't see cameras and v4l devices implementing all that complexity to
> >> get a pure userspace side sync solution.
> >>
> >
> > Like it or not this is what a lot of application writers want (look at
> > mantle and metal and similar new APIs or android synpts).  Having
> > queues and fences in userspace allows the application to structure
> > things to best fit their own task graphs.  The app can decide how to
> > deal with dependencies and synchronization explicitly instead of
> > blocking the queues in the kernel for everyone.  Anyway, this is
> > getting off topic.
> 
> Well there's explicit fences as used in opencl and android syncpts. My
> plan is actually to support that in i915 using Maarten's struct fence
> stuff (and there's just a very trivial patch for the android stuff in
> merging needed to get there). What doesn't work is fences created
> behind the kernel's back purely in userspace by giving shared memory
> locations special meaning. Those get the kernel completely out of the
> picture (as opposed to android syncpts, which just make sync
> explicit).
> 
> I guess long-term we might need something like gpu futexes to make
> that pure userspace syncing integrate a bit better, but imo that's (at
> least for now) out of scope. For fences here I have the goal of one

Yeah, with a little kernel help you could have a mostly kernel
independent sync mechanism using just shared mem in userspace.  The
kernel would just need to signal any interested clients when something
happened (even if it didn't know what) and let userspace sort out the
rest.  I think that would be a nice thing to provide at some point, as
it could allow for some fine grained CPU/GPU algorithms that use
lightweight synchronization with and without busy looping on the CPU
side.

But all of that is definitely a lower priority than getting explicit
fencing exported to userspace to work right, both for intra-driver
sync and inter-driver sync.

> internally representation used by both implicit syncing (dma-buf on
> classic linux, e.g. prime) and explicit fencing on android or opencl
> or something like that.
> 
> We don't have the code yet ready, but that's the direction i915 will
> move towards for the near future. Jesse is working on some patches
> already.

Yeah I'd like to get some feedback from Maarten on my bits so I can get
them ready for upstream.  I still need to add documentation and tests,
but I'd like to make sure the interfaces and internals get acked first.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 09/17] drm/radeon: use common fence implementation for fences

2014-07-22 Thread Jesse Barnes
On Tue, 22 Jul 2014 17:48:18 +0200
Daniel Vetter dan...@ffwll.ch wrote:

 On Tue, Jul 22, 2014 at 5:42 PM, Alex Deucher alexdeuc...@gmail.com wrote:
  Fence-based syncing between userspace queues submitted stuff through
  doorbells and anything submitted by the general simply wont work.
  Which is why I think the doorbell is a stupid interface since I just
  don't see cameras and v4l devices implementing all that complexity to
  get a pure userspace side sync solution.
 
 
  Like it or not this is what a lot of application writers want (look at
  mantle and metal and similar new APIs or android synpts).  Having
  queues and fences in userspace allows the application to structure
  things to best fit their own task graphs.  The app can decide how to
  deal with dependencies and synchronization explicitly instead of
  blocking the queues in the kernel for everyone.  Anyway, this is
  getting off topic.
 
 Well there's explicit fences as used in opencl and android syncpts. My
 plan is actually to support that in i915 using Maarten's struct fence
 stuff (and there's just a very trivial patch for the android stuff in
 merging needed to get there). What doesn't work is fences created
 behind the kernel's back purely in userspace by giving shared memory
 locations special meaning. Those get the kernel completely out of the
 picture (as opposed to android syncpts, which just make sync
 explicit).
 
 I guess long-term we might need something like gpu futexes to make
 that pure userspace syncing integrate a bit better, but imo that's (at
 least for now) out of scope. For fences here I have the goal of one

Yeah, with a little kernel help you could have a mostly kernel
independent sync mechanism using just shared mem in userspace.  The
kernel would just need to signal any interested clients when something
happened (even if it didn't know what) and let userspace sort out the
rest.  I think that would be a nice thing to provide at some point, as
it could allow for some fine grained CPU/GPU algorithms that use
lightweight synchronization with and without busy looping on the CPU
side.

But all of that is definitely a lower priority than getting explicit
fencing exported to userspace to work right, both for intra-driver
sync and inter-driver sync.

 internally representation used by both implicit syncing (dma-buf on
 classic linux, e.g. prime) and explicit fencing on android or opencl
 or something like that.
 
 We don't have the code yet ready, but that's the direction i915 will
 move towards for the near future. Jesse is working on some patches
 already.

Yeah I'd like to get some feedback from Maarten on my bits so I can get
them ready for upstream.  I still need to add documentation and tests,
but I'd like to make sure the interfaces and internals get acked first.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 3.13 i915 brightness settings broken when going from docked -> undocked

2014-02-24 Thread Jesse Barnes
On Sun, 23 Feb 2014 10:50:59 -0500
Josh Boyer  wrote:

> On Thu, Feb 20, 2014 at 07:31:41PM -0500, Josh Boyer wrote:
> >On Wed, Feb 19, 2014 at 9:20 PM, Josh Boyer  
> >wrote:
> >> Hi All,
> >>
> >> We've had a rather weird report[1] of the brightness adjustments being
> >> broken in a specific case with Thinkpad x220 hardware (SandyBridge
> >> based).  If you boot the machine with it in a dock and then undock,
> >> the brightness adjustments do not work.  That is with either the FN
> >> keys or the GNOME brightness slider.
> >>
> >> I can see that the value of
> >> /sys/class/backlight/acpi_video0/brightness increases/decreases but
> >> /sys/class/backlight/intel_backlight/brightness doesn't reflect any
> >> changes.  With 3.12 this works, and oddly with 3.14-rc1 it works
> >> (specifically, it starts working around v3.13-10231-g53d8ab2 which is
> >> right after the first DRM merge for 3.14).  With 3.13, if I undock and
> >> echo a higher value in the intel_backlight_brightness sysfs entry, the
> >> brightness will actually increase so it can be done manually, but it
> >> does not work as you'd expect.
> >>
> >> I'm in the middle of trying to do a reverse bisect for which patch
> >> fixes it in the 3.14-rcX series, but that's taking a while.  I thought
> >> I'd email and see if anyone already knows about this situation, what
> >> patch in 3.13 broke this, and which one then fixed it again.  Thus far
> >> all I've gathered is that backlight handling is confusing.
> >
> >The reverse bisect between 3.13 and 3.14-rc1 didn't prove fruitful,
> >either because I messed it up or there's a combination of things that
> >fix the issue.  So instead I did a regular git bisect between 3.12 and
> >3.13 to see which commit _broke_ things and caused the above behavior.
> > That landed me at:
> >
> >Author: Jesse Barnes 
> >Date:   Thu Oct 31 18:55:49 2013 +0200
> >
> >drm/i915: make backlight functions take a connector
> >
> >I have no idea if that makes sense or not, but it's at least something
> >that seems to be in a relevant area of code.  Does anyone involved in
> >that know why it would cause the above symptoms on 3.13, and which
> >commit(s) fix it in 3.14-rc1?
> 
> Since nobody is replying I poked around a bit myself.  The one commit
> that looks somewhat relevant in 3.14-rc1 seems to be:
> 
> commit c91c9f32843a1b433de5a1ead4789a6bc8d3d914
> Author: Jani Nikula 
> Date:   Fri Nov 8 16:48:55 2013 +0200
> 
> drm/i915: make asle notifications update backlight on all connectors
> 
> That doesn't apply cleanly on 3.13 because of the other reworks that
> went in first, so I came up with the patch below.  It seems to fix it
> for my machine, but I'm waiting for confirmation from the original bug
> reporter still.  Maybe this will elicit some comments?
> 
> josh
> 
> Backport of upstream commit c91c9f328
> 
> ---
>  drivers/gpu/drm/i915/i915_drv.h   |  1 +
>  drivers/gpu/drm/i915/intel_opregion.c | 31 ++-
>  drivers/gpu/drm/i915/intel_panel.c|  4 
>  3 files changed, 11 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 221ac62..d6d4349 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1371,6 +1371,7 @@ typedef struct drm_i915_private {
>  
>   /* backlight */
>   struct {
> + bool present;
>   int level;
>   bool enabled;
>   spinlock_t lock; /* bl registers and the above bl fields */
> diff --git a/drivers/gpu/drm/i915/intel_opregion.c 
> b/drivers/gpu/drm/i915/intel_opregion.c
> index 6d69a9b..b2a51ae 100644
> --- a/drivers/gpu/drm/i915/intel_opregion.c
> +++ b/drivers/gpu/drm/i915/intel_opregion.c
> @@ -414,38 +414,19 @@ static u32 asle_set_backlight(struct drm_device *dev, 
> u32 bclp)
>   return ASLC_BACKLIGHT_FAILED;
>  
>   mutex_lock(>mode_config.mutex);
> - /*
> -  * Could match the OpRegion connector here instead, but we'd also need
> -  * to verify the connector could handle a backlight call.
> -  */
> - list_for_each_entry(encoder, >mode_config.encoder_list, head)
> - if (encoder->crtc == crtc) {
> - found = true;
> - break;
> - }
> -
> - if (!found) {
> - ret = ASLC_BACKLIGHT_FAILED;
> - goto out;
> - }
>  
> - list_for_each_entry(connector, >mod

Re: 3.13 i915 brightness settings broken when going from docked - undocked

2014-02-24 Thread Jesse Barnes
On Sun, 23 Feb 2014 10:50:59 -0500
Josh Boyer jwbo...@fedoraproject.org wrote:

 On Thu, Feb 20, 2014 at 07:31:41PM -0500, Josh Boyer wrote:
 On Wed, Feb 19, 2014 at 9:20 PM, Josh Boyer jwbo...@fedoraproject.org 
 wrote:
  Hi All,
 
  We've had a rather weird report[1] of the brightness adjustments being
  broken in a specific case with Thinkpad x220 hardware (SandyBridge
  based).  If you boot the machine with it in a dock and then undock,
  the brightness adjustments do not work.  That is with either the FN
  keys or the GNOME brightness slider.
 
  I can see that the value of
  /sys/class/backlight/acpi_video0/brightness increases/decreases but
  /sys/class/backlight/intel_backlight/brightness doesn't reflect any
  changes.  With 3.12 this works, and oddly with 3.14-rc1 it works
  (specifically, it starts working around v3.13-10231-g53d8ab2 which is
  right after the first DRM merge for 3.14).  With 3.13, if I undock and
  echo a higher value in the intel_backlight_brightness sysfs entry, the
  brightness will actually increase so it can be done manually, but it
  does not work as you'd expect.
 
  I'm in the middle of trying to do a reverse bisect for which patch
  fixes it in the 3.14-rcX series, but that's taking a while.  I thought
  I'd email and see if anyone already knows about this situation, what
  patch in 3.13 broke this, and which one then fixed it again.  Thus far
  all I've gathered is that backlight handling is confusing.
 
 The reverse bisect between 3.13 and 3.14-rc1 didn't prove fruitful,
 either because I messed it up or there's a combination of things that
 fix the issue.  So instead I did a regular git bisect between 3.12 and
 3.13 to see which commit _broke_ things and caused the above behavior.
  That landed me at:
 
 Author: Jesse Barnes jbar...@virtuousgeek.org
 Date:   Thu Oct 31 18:55:49 2013 +0200
 
 drm/i915: make backlight functions take a connector
 
 I have no idea if that makes sense or not, but it's at least something
 that seems to be in a relevant area of code.  Does anyone involved in
 that know why it would cause the above symptoms on 3.13, and which
 commit(s) fix it in 3.14-rc1?
 
 Since nobody is replying I poked around a bit myself.  The one commit
 that looks somewhat relevant in 3.14-rc1 seems to be:
 
 commit c91c9f32843a1b433de5a1ead4789a6bc8d3d914
 Author: Jani Nikula jani.nik...@intel.com
 Date:   Fri Nov 8 16:48:55 2013 +0200
 
 drm/i915: make asle notifications update backlight on all connectors
 
 That doesn't apply cleanly on 3.13 because of the other reworks that
 went in first, so I came up with the patch below.  It seems to fix it
 for my machine, but I'm waiting for confirmation from the original bug
 reporter still.  Maybe this will elicit some comments?
 
 josh
 
 Backport of upstream commit c91c9f328
 
 ---
  drivers/gpu/drm/i915/i915_drv.h   |  1 +
  drivers/gpu/drm/i915/intel_opregion.c | 31 ++-
  drivers/gpu/drm/i915/intel_panel.c|  4 
  3 files changed, 11 insertions(+), 25 deletions(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
 index 221ac62..d6d4349 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -1371,6 +1371,7 @@ typedef struct drm_i915_private {
  
   /* backlight */
   struct {
 + bool present;
   int level;
   bool enabled;
   spinlock_t lock; /* bl registers and the above bl fields */
 diff --git a/drivers/gpu/drm/i915/intel_opregion.c 
 b/drivers/gpu/drm/i915/intel_opregion.c
 index 6d69a9b..b2a51ae 100644
 --- a/drivers/gpu/drm/i915/intel_opregion.c
 +++ b/drivers/gpu/drm/i915/intel_opregion.c
 @@ -414,38 +414,19 @@ static u32 asle_set_backlight(struct drm_device *dev, 
 u32 bclp)
   return ASLC_BACKLIGHT_FAILED;
  
   mutex_lock(dev-mode_config.mutex);
 - /*
 -  * Could match the OpRegion connector here instead, but we'd also need
 -  * to verify the connector could handle a backlight call.
 -  */
 - list_for_each_entry(encoder, dev-mode_config.encoder_list, head)
 - if (encoder-crtc == crtc) {
 - found = true;
 - break;
 - }
 -
 - if (!found) {
 - ret = ASLC_BACKLIGHT_FAILED;
 - goto out;
 - }
  
 - list_for_each_entry(connector, dev-mode_config.connector_list, head)
 - if (connector-encoder == encoder)
 - intel_connector = to_intel_connector(connector);
 -
 - if (!intel_connector) {
 - ret = ASLC_BACKLIGHT_FAILED;
 - goto out;
 + DRM_DEBUG_KMS(updating opregion backlight %d/255\n, bclp);
 + list_for_each_entry(connector, dev-mode_config.connector_list, head) {
 + intel_connector = to_intel_connector(connector);
 + if (dev_priv-backlight.present)
 + intel_panel_set_backlight(intel_connector, bclp, 255

Re: Baytrail/T (ASUS T100 etc) regression from 3.13 onwards

2014-02-11 Thread Jesse Barnes
On Tue, 11 Feb 2014 18:28:12 +
One Thousand Gnomes  wrote:

> O> > According to the reg dumps, this is actually a MIPI panel we're failing
> > > to bring up properly.  We're working on that issue, but in the
> > > meantime, maybe something like the below would work for you?
> > 
> > A module paramater to fix a bug?  Ugh, that's almost worse than just
> > reverting the original patch, right?
> > 
> > Please don't do this, the distros will hate you even more than they
> > currently do :)
> 
> Can we just turn hot plug detection off if a panel is present for now ?

Looks like another option would be to add an 'e' to your forced boot
line.  That should prevent the detection stuff from running.  E.g. in
your case:
video=VGA-1:1366x768e

There's some info on this in Documentation/fb/modedb.txt

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Baytrail/T (ASUS T100 etc) regression from 3.13 onwards

2014-02-11 Thread Jesse Barnes
On Tue, 11 Feb 2014 10:18:15 -0800
Greg KH  wrote:

> On Tue, Feb 11, 2014 at 10:11:37AM -0800, Jesse Barnes wrote:
> > On Tue, 11 Feb 2014 08:35:41 -0800
> > Greg KH  wrote:
> > 
> > > On Tue, Feb 11, 2014 at 02:22:03PM +, One Thousand Gnomes wrote:
> > > > On Tue, 14 Jan 2014 12:01:03 +
> > > > One Thousand Gnomes  wrote:
> > > > 
> > > > > On Sun, 12 Jan 2014 20:33:03 +0700
> > > > > Linus Torvalds  wrote:
> > > > > 
> > > > > > Another week, another RC. And things look fine.
> > > > > 
> > > > > It seems to hate Baytrail/T
> > > > > 
> > > > > My ASUS T100TA has gone from 3.11 'needs video=VGA-1:blah' to get the 
> > > > > mode
> > > > > right but otherwise running nicely and playing 3D games to 3.13-rc8
> > > > 
> > > > This has now been pinned down to (confirmed by multiple people)
> > > > 
> > > > commit 6c4a8962a4a078cacfc8eb5d4bd79f6343b8cd7a
> > > > Author: Jesse Barnes 
> > > > Date:   Tue Sep 10 14:54:42 2013 -0700
> > > > 
> > > > drm/i915/vlv: re-enable hotplug detect based probing on VLV/BYT
> > > > 
> > > > 
> > > > which given this is affecting some of the top 10 Amazon selling new
> > > > laptops and has not been fixed for 3.14-rc should IMHO be reverted until
> > > > such time as any needed debugging for the MIPI panels is done for this
> > > > driver and the hotplug probing doesn't break the workarounds.
> > > > 
> > > > Likewise it wants pulling from any -stable backports.
> > > 
> > > I only see it in the 3.13 tree, so when this gets reverted in Linus's
> > > tree, can someone mark it for stable so I know to pick it up for
> > > 3.13-stable as well?
> > 
> > Ugg, so reverting a real fix to make the T100TA work by accident...
> > this is ugly.
> > 
> > According to the reg dumps, this is actually a MIPI panel we're failing
> > to bring up properly.  We're working on that issue, but in the
> > meantime, maybe something like the below would work for you?
> 
> A module paramater to fix a bug?  Ugh, that's almost worse than just
> reverting the original patch, right?
> 
> Please don't do this, the distros will hate you even more than they
> currently do :)

Well reverting the real bug fix to get back to the "this happens to
work but totally by accident" is also pretty bad...

Another option would be to quirk this machine, but that's gross too.
Would be best to just fix the bug for real.

FWIW the kernel bug is here:
https://bugzilla.kernel.org/show_bug.cgi?id=68451

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Baytrail/T (ASUS T100 etc) regression from 3.13 onwards

2014-02-11 Thread Jesse Barnes
On Tue, 11 Feb 2014 08:35:41 -0800
Greg KH  wrote:

> On Tue, Feb 11, 2014 at 02:22:03PM +, One Thousand Gnomes wrote:
> > On Tue, 14 Jan 2014 12:01:03 +
> > One Thousand Gnomes  wrote:
> > 
> > > On Sun, 12 Jan 2014 20:33:03 +0700
> > > Linus Torvalds  wrote:
> > > 
> > > > Another week, another RC. And things look fine.
> > > 
> > > It seems to hate Baytrail/T
> > > 
> > > My ASUS T100TA has gone from 3.11 'needs video=VGA-1:blah' to get the mode
> > > right but otherwise running nicely and playing 3D games to 3.13-rc8
> > 
> > This has now been pinned down to (confirmed by multiple people)
> > 
> > commit 6c4a8962a4a078cacfc8eb5d4bd79f6343b8cd7a
> > Author: Jesse Barnes 
> > Date:   Tue Sep 10 14:54:42 2013 -0700
> > 
> > drm/i915/vlv: re-enable hotplug detect based probing on VLV/BYT
> > 
> > 
> > which given this is affecting some of the top 10 Amazon selling new
> > laptops and has not been fixed for 3.14-rc should IMHO be reverted until
> > such time as any needed debugging for the MIPI panels is done for this
> > driver and the hotplug probing doesn't break the workarounds.
> > 
> > Likewise it wants pulling from any -stable backports.
> 
> I only see it in the 3.13 tree, so when this gets reverted in Linus's
> tree, can someone mark it for stable so I know to pick it up for
> 3.13-stable as well?

Ugg, so reverting a real fix to make the T100TA work by accident...
this is ugly.

According to the reg dumps, this is actually a MIPI panel we're failing
to bring up properly.  We're working on that issue, but in the
meantime, maybe something like the below would work for you?

-- 
Jesse Barnes, Intel Open Source Technology Center

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 61fb9fc..adc16a5 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -155,6 +155,11 @@ module_param_named(prefault_disable, 
i915_prefault_disable, bool, 0600);
 MODULE_PARM_DESC(prefault_disable,
"Disable page prefaulting for pread/pwrite/reloc 
(default:false). For developers only.");
 
+bool i915_force_vga_hotplug __read_mostly;
+module_param_named(force_vga_hotplug, i915_force_vga_hotplug, bool, 0600);
+MODULE_PARM_DESC(force_vga_hotplug,
+"Force VGA hotplug detection to always return true (default: 
false)");
+
 static struct drm_driver driver;
 
 static const struct intel_device_info intel_i830_info = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 29e1e86..f7a5e98 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1918,6 +1918,7 @@ extern bool i915_fastboot __read_mostly;
 extern int i915_enable_pc8 __read_mostly;
 extern int i915_pc8_timeout __read_mostly;
 extern bool i915_prefault_disable __read_mostly;
+extern bool i915_force_vga_hotplug __read_mostly;
 
 extern int i915_suspend(struct drm_device *dev, pm_message_t state);
 extern int i915_resume(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_crt.c b/drivers/gpu/drm/i915/intel_crt.c
index e2e39e6..55531d6 100644
--- a/drivers/gpu/drm/i915/intel_crt.c
+++ b/drivers/gpu/drm/i915/intel_crt.c
@@ -434,6 +434,9 @@ static bool intel_crt_detect_hotplug(struct drm_connector 
*connector)
/* and put the bits back */
I915_WRITE(PORT_HOTPLUG_EN, orig);
 
+   if (i915_force_vga_hotplug)
+   ret = true;
+
return ret;
 }
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Baytrail/T (ASUS T100 etc) regression from 3.13 onwards

2014-02-11 Thread Jesse Barnes
On Tue, 11 Feb 2014 08:35:41 -0800
Greg KH gre...@linuxfoundation.org wrote:

 On Tue, Feb 11, 2014 at 02:22:03PM +, One Thousand Gnomes wrote:
  On Tue, 14 Jan 2014 12:01:03 +
  One Thousand Gnomes gno...@lxorguk.ukuu.org.uk wrote:
  
   On Sun, 12 Jan 2014 20:33:03 +0700
   Linus Torvalds torva...@linux-foundation.org wrote:
   
Another week, another RC. And things look fine.
   
   It seems to hate Baytrail/T
   
   My ASUS T100TA has gone from 3.11 'needs video=VGA-1:blah' to get the mode
   right but otherwise running nicely and playing 3D games to 3.13-rc8
  
  This has now been pinned down to (confirmed by multiple people)
  
  commit 6c4a8962a4a078cacfc8eb5d4bd79f6343b8cd7a
  Author: Jesse Barnes jbar...@virtuousgeek.org
  Date:   Tue Sep 10 14:54:42 2013 -0700
  
  drm/i915/vlv: re-enable hotplug detect based probing on VLV/BYT
  
  
  which given this is affecting some of the top 10 Amazon selling new
  laptops and has not been fixed for 3.14-rc should IMHO be reverted until
  such time as any needed debugging for the MIPI panels is done for this
  driver and the hotplug probing doesn't break the workarounds.
  
  Likewise it wants pulling from any -stable backports.
 
 I only see it in the 3.13 tree, so when this gets reverted in Linus's
 tree, can someone mark it for stable so I know to pick it up for
 3.13-stable as well?

Ugg, so reverting a real fix to make the T100TA work by accident...
this is ugly.

According to the reg dumps, this is actually a MIPI panel we're failing
to bring up properly.  We're working on that issue, but in the
meantime, maybe something like the below would work for you?

-- 
Jesse Barnes, Intel Open Source Technology Center

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 61fb9fc..adc16a5 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -155,6 +155,11 @@ module_param_named(prefault_disable, 
i915_prefault_disable, bool, 0600);
 MODULE_PARM_DESC(prefault_disable,
Disable page prefaulting for pread/pwrite/reloc 
(default:false). For developers only.);
 
+bool i915_force_vga_hotplug __read_mostly;
+module_param_named(force_vga_hotplug, i915_force_vga_hotplug, bool, 0600);
+MODULE_PARM_DESC(force_vga_hotplug,
+Force VGA hotplug detection to always return true (default: 
false));
+
 static struct drm_driver driver;
 
 static const struct intel_device_info intel_i830_info = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 29e1e86..f7a5e98 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1918,6 +1918,7 @@ extern bool i915_fastboot __read_mostly;
 extern int i915_enable_pc8 __read_mostly;
 extern int i915_pc8_timeout __read_mostly;
 extern bool i915_prefault_disable __read_mostly;
+extern bool i915_force_vga_hotplug __read_mostly;
 
 extern int i915_suspend(struct drm_device *dev, pm_message_t state);
 extern int i915_resume(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_crt.c b/drivers/gpu/drm/i915/intel_crt.c
index e2e39e6..55531d6 100644
--- a/drivers/gpu/drm/i915/intel_crt.c
+++ b/drivers/gpu/drm/i915/intel_crt.c
@@ -434,6 +434,9 @@ static bool intel_crt_detect_hotplug(struct drm_connector 
*connector)
/* and put the bits back */
I915_WRITE(PORT_HOTPLUG_EN, orig);
 
+   if (i915_force_vga_hotplug)
+   ret = true;
+
return ret;
 }
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Baytrail/T (ASUS T100 etc) regression from 3.13 onwards

2014-02-11 Thread Jesse Barnes
On Tue, 11 Feb 2014 10:18:15 -0800
Greg KH gre...@linuxfoundation.org wrote:

 On Tue, Feb 11, 2014 at 10:11:37AM -0800, Jesse Barnes wrote:
  On Tue, 11 Feb 2014 08:35:41 -0800
  Greg KH gre...@linuxfoundation.org wrote:
  
   On Tue, Feb 11, 2014 at 02:22:03PM +, One Thousand Gnomes wrote:
On Tue, 14 Jan 2014 12:01:03 +
One Thousand Gnomes gno...@lxorguk.ukuu.org.uk wrote:

 On Sun, 12 Jan 2014 20:33:03 +0700
 Linus Torvalds torva...@linux-foundation.org wrote:
 
  Another week, another RC. And things look fine.
 
 It seems to hate Baytrail/T
 
 My ASUS T100TA has gone from 3.11 'needs video=VGA-1:blah' to get the 
 mode
 right but otherwise running nicely and playing 3D games to 3.13-rc8

This has now been pinned down to (confirmed by multiple people)

commit 6c4a8962a4a078cacfc8eb5d4bd79f6343b8cd7a
Author: Jesse Barnes jbar...@virtuousgeek.org
Date:   Tue Sep 10 14:54:42 2013 -0700

drm/i915/vlv: re-enable hotplug detect based probing on VLV/BYT


which given this is affecting some of the top 10 Amazon selling new
laptops and has not been fixed for 3.14-rc should IMHO be reverted until
such time as any needed debugging for the MIPI panels is done for this
driver and the hotplug probing doesn't break the workarounds.

Likewise it wants pulling from any -stable backports.
   
   I only see it in the 3.13 tree, so when this gets reverted in Linus's
   tree, can someone mark it for stable so I know to pick it up for
   3.13-stable as well?
  
  Ugg, so reverting a real fix to make the T100TA work by accident...
  this is ugly.
  
  According to the reg dumps, this is actually a MIPI panel we're failing
  to bring up properly.  We're working on that issue, but in the
  meantime, maybe something like the below would work for you?
 
 A module paramater to fix a bug?  Ugh, that's almost worse than just
 reverting the original patch, right?
 
 Please don't do this, the distros will hate you even more than they
 currently do :)

Well reverting the real bug fix to get back to the this happens to
work but totally by accident is also pretty bad...

Another option would be to quirk this machine, but that's gross too.
Would be best to just fix the bug for real.

FWIW the kernel bug is here:
https://bugzilla.kernel.org/show_bug.cgi?id=68451

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Baytrail/T (ASUS T100 etc) regression from 3.13 onwards

2014-02-11 Thread Jesse Barnes
On Tue, 11 Feb 2014 18:28:12 +
One Thousand Gnomes gno...@lxorguk.ukuu.org.uk wrote:

 O  According to the reg dumps, this is actually a MIPI panel we're failing
   to bring up properly.  We're working on that issue, but in the
   meantime, maybe something like the below would work for you?
  
  A module paramater to fix a bug?  Ugh, that's almost worse than just
  reverting the original patch, right?
  
  Please don't do this, the distros will hate you even more than they
  currently do :)
 
 Can we just turn hot plug detection off if a panel is present for now ?

Looks like another option would be to add an 'e' to your forced boot
line.  That should prevent the detection stuff from running.  E.g. in
your case:
video=VGA-1:1366x768e

There's some info on this in Documentation/fb/modedb.txt

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Resume regression in 5a87182aa21d6d5d306840feab9321818dd3e2a3

2013-12-11 Thread Jesse Barnes
On Thu, 12 Dec 2013 02:22:52 +0100
"Rafael J. Wysocki"  wrote:

> On Wednesday, December 11, 2013 04:38:12 PM Jesse Barnes wrote:
> > On one of my BYT systems, I see a hang at resume.  Based on my bisect,
> > it looks like this commit is the culprit:
> > 
> > commit 5a87182aa21d6d5d306840feab9321818dd3e2a3
> > Author: Viresh Kumar 
> > Date:   Wed Nov 27 09:09:42 2013 +0530
> > 
> > cpufreq: suspend governors on system suspend/hibernate
> > 
> > Is this a known issue?
> 
> Yes.
> 
> > If so, is there a fix somewhere already?
> 
> The commit has been reverted (in the Linus' tree).

Great, thanks.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Resume regression in 5a87182aa21d6d5d306840feab9321818dd3e2a3

2013-12-11 Thread Jesse Barnes
On one of my BYT systems, I see a hang at resume.  Based on my bisect,
it looks like this commit is the culprit:

commit 5a87182aa21d6d5d306840feab9321818dd3e2a3
Author: Viresh Kumar 
Date:   Wed Nov 27 09:09:42 2013 +0530

cpufreq: suspend governors on system suspend/hibernate

Is this a known issue?  If so, is there a fix somewhere already?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Resume regression in 5a87182aa21d6d5d306840feab9321818dd3e2a3

2013-12-11 Thread Jesse Barnes
On one of my BYT systems, I see a hang at resume.  Based on my bisect,
it looks like this commit is the culprit:

commit 5a87182aa21d6d5d306840feab9321818dd3e2a3
Author: Viresh Kumar viresh.ku...@linaro.org
Date:   Wed Nov 27 09:09:42 2013 +0530

cpufreq: suspend governors on system suspend/hibernate

Is this a known issue?  If so, is there a fix somewhere already?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Resume regression in 5a87182aa21d6d5d306840feab9321818dd3e2a3

2013-12-11 Thread Jesse Barnes
On Thu, 12 Dec 2013 02:22:52 +0100
Rafael J. Wysocki r...@rjwysocki.net wrote:

 On Wednesday, December 11, 2013 04:38:12 PM Jesse Barnes wrote:
  On one of my BYT systems, I see a hang at resume.  Based on my bisect,
  it looks like this commit is the culprit:
  
  commit 5a87182aa21d6d5d306840feab9321818dd3e2a3
  Author: Viresh Kumar viresh.ku...@linaro.org
  Date:   Wed Nov 27 09:09:42 2013 +0530
  
  cpufreq: suspend governors on system suspend/hibernate
  
  Is this a known issue?
 
 Yes.
 
  If so, is there a fix somewhere already?
 
 The commit has been reverted (in the Linus' tree).

Great, thanks.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Regression: hang on BYT with new pstate support

2013-12-06 Thread Jesse Barnes
The drm-intel tree just updated to 3.13-rc, and I started seeing early
hard hangs on my BYT system.  I bisected it down to this commit:

commit 19e77c28dbf1972305da0dfeb92a62f83df3a91d
Author: Dirk Brandewie 
Date:   Mon Oct 21 09:20:35 2013 -0700

intel_pstate: Add Baytrail support

Add support for the Baytrail processor.

The hangs occur at different times, so unless the get/set functions
aren't called until we change the state, the problem may be in the
core_set_pstate function instead?  Maybe it's not suited to BYT somehow?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Regression: hang on BYT with new pstate support

2013-12-06 Thread Jesse Barnes
The drm-intel tree just updated to 3.13-rc, and I started seeing early
hard hangs on my BYT system.  I bisected it down to this commit:

commit 19e77c28dbf1972305da0dfeb92a62f83df3a91d
Author: Dirk Brandewie dirk.j.brande...@intel.com
Date:   Mon Oct 21 09:20:35 2013 -0700

intel_pstate: Add Baytrail support

Add support for the Baytrail processor.

The hangs occur at different times, so unless the get/set functions
aren't called until we change the state, the problem may be in the
core_set_pstate function instead?  Maybe it's not suited to BYT somehow?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] x86/early quirk: use gen6 stolen detection for VLV

2013-11-12 Thread Jesse Barnes
We've always been able to use either method on VLV, but it appears more
recent BIOSes only support the gen6 method, so switch over to that.

References: https://bugs.freedesktop.org/show_bug.cgi?id=71370
Signed-off-by: Jesse Barnes 
---
 arch/x86/kernel/early-quirks.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 96f958d..bc4a088 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -330,8 +330,8 @@ static struct pci_device_id intel_stolen_ids[] __initdata = 
{
INTEL_I915GM_IDS(gen3_stolen_size),
INTEL_I945G_IDS(gen3_stolen_size),
INTEL_I945GM_IDS(gen3_stolen_size),
-   INTEL_VLV_M_IDS(gen3_stolen_size),
-   INTEL_VLV_D_IDS(gen3_stolen_size),
+   INTEL_VLV_M_IDS(gen6_stolen_size),
+   INTEL_VLV_D_IDS(gen6_stolen_size),
INTEL_PINEVIEW_IDS(gen3_stolen_size),
INTEL_I965G_IDS(gen3_stolen_size),
INTEL_G33_IDS(gen3_stolen_size),
-- 
1.8.4.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] x86/early quirk: use gen6 stolen detection for VLV

2013-11-12 Thread Jesse Barnes
We've always been able to use either method on VLV, but it appears more
recent BIOSes only support the gen6 method, so switch over to that.

References: https://bugs.freedesktop.org/show_bug.cgi?id=71370
Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 arch/x86/kernel/early-quirks.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 96f958d..bc4a088 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -330,8 +330,8 @@ static struct pci_device_id intel_stolen_ids[] __initdata = 
{
INTEL_I915GM_IDS(gen3_stolen_size),
INTEL_I945G_IDS(gen3_stolen_size),
INTEL_I945GM_IDS(gen3_stolen_size),
-   INTEL_VLV_M_IDS(gen3_stolen_size),
-   INTEL_VLV_D_IDS(gen3_stolen_size),
+   INTEL_VLV_M_IDS(gen6_stolen_size),
+   INTEL_VLV_D_IDS(gen6_stolen_size),
INTEL_PINEVIEW_IDS(gen3_stolen_size),
INTEL_I965G_IDS(gen3_stolen_size),
INTEL_G33_IDS(gen3_stolen_size),
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] drm tree for 3.12-rc1

2013-09-05 Thread Jesse Barnes
On Thu, 5 Sep 2013 12:18:32 -0700
Linus Torvalds  wrote:

> On Thu, Sep 5, 2013 at 3:41 AM, Dave Airlie  wrote:
> >
> > i915: Haswell PC8+ support and eLLC support, HDMI 4K support, initial 
> > per-process VMA pieces,
> >   watermark reworks, convert to generic hdmi infoframes, encoder 
> > reworking, fastboot support,
> 
> Hmm.
> 
> The first time I booted this, I just got a black screen on my Haswell
> desktop when X11 started up.  I could ctrl-alt-BS and ctrl-alt-del to
> reboot the machine, and neither the Xorg.0.log nor the dmesg contained
> anything interesting.

Did the console come back after ctl-alt-bs?  Or was it just a blind
reboot?  Troubling that it doesn't happen again...

> I was about to try to bisect it, but decided to see how repeatable it
> was, and it didn't happen again. But it also hasn't ever happened
> before, so I'm a bit worried.
> 
> This is with the DP cable, which has made my other Haswell issues go away.
> 
> I'm also a bit bummed that hw acceleration of video doesn't seem to
> work on Haswell, meaning that full-screen is now a jerky mess. I fear
> that that is user-space libraries/X.org, but I thought I'd mention it
> in the hope of getting a "oh, it's working for us, you'll get a fix
> for it soon".

AFAIK we have libva support out there for HSW.  The trick is getting
your stack to actually use it.  Gwenole or Sean may be able to help.

> Because my shiny new 65W haswell is really nice and does a "make
> allmodconfig" in half the time of my old machine, but the GPU side has
> been something of a step backwards...

Well we definitely don't want that...

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git pull] drm tree for 3.12-rc1

2013-09-05 Thread Jesse Barnes
On Thu, 5 Sep 2013 12:18:32 -0700
Linus Torvalds torva...@linux-foundation.org wrote:

 On Thu, Sep 5, 2013 at 3:41 AM, Dave Airlie airl...@linux.ie wrote:
 
  i915: Haswell PC8+ support and eLLC support, HDMI 4K support, initial 
  per-process VMA pieces,
watermark reworks, convert to generic hdmi infoframes, encoder 
  reworking, fastboot support,
 
 Hmm.
 
 The first time I booted this, I just got a black screen on my Haswell
 desktop when X11 started up.  I could ctrl-alt-BS and ctrl-alt-del to
 reboot the machine, and neither the Xorg.0.log nor the dmesg contained
 anything interesting.

Did the console come back after ctl-alt-bs?  Or was it just a blind
reboot?  Troubling that it doesn't happen again...

 I was about to try to bisect it, but decided to see how repeatable it
 was, and it didn't happen again. But it also hasn't ever happened
 before, so I'm a bit worried.
 
 This is with the DP cable, which has made my other Haswell issues go away.
 
 I'm also a bit bummed that hw acceleration of video doesn't seem to
 work on Haswell, meaning that full-screen is now a jerky mess. I fear
 that that is user-space libraries/X.org, but I thought I'd mention it
 in the hope of getting a oh, it's working for us, you'll get a fix
 for it soon.

AFAIK we have libva support out there for HSW.  The trick is getting
your stack to actually use it.  Gwenole or Sean may be able to help.

 Because my shiny new 65W haswell is really nice and does a make
 allmodconfig in half the time of my old machine, but the GPU side has
 been something of a step backwards...

Well we definitely don't want that...

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] intel_ips: blacklist ASUSTek G60JX laptops

2013-08-29 Thread Jesse Barnes
On Thu, 29 Aug 2013 12:18:14 -0400
Joseph Salisbury  wrote:

> On 08/14/2013 05:11 PM, Linus Torvalds wrote:
> > On Wed, Aug 14, 2013 at 9:38 AM, Jesse Barnes  
> > wrote:
> >> Linus, you may want to pick this up directly, as I'm not sure if
> >> Matthew is still looking after the x86 drivers these days.
> > Can't we make the simpler patch be to just not spam the logs? Do it
> > once, and forget about it. Maybe make the timeouts long enough to make
> > sure that there are no other downsides from having the driver
> > occasionally testing if it's back?
> >
> > I *detest* hardware-specific blacklists. They are impossible to
> > maintain, and we've had the situation more than once that we fixed the
> > real bug that caused the blacklist in the first place, but the entry
> > stays around because nobody knows/cares/tests it.
> >
> >   Linus
> Hi Jesse,
> 
> What are your thoughts on alternatives to using a blacklist?  Is there a
> bit we can read to determine if IPS is supported on a platform?
> 
> If not, can we do something like the following?  Basically increase the
> timeout to 30 seconds then return an error from the monitor if there was
> no update in that time:
> 
> diff --git a/drivers/platform/x86/intel_ips.c
> b/drivers/platform/x86/intel_ips.c
> index 18dcb58..a2391f7 100644
> --- a/drivers/platform/x86/intel_ips.c
> +++ b/drivers/platform/x86/intel_ips.c
> @@ -1095,8 +1095,15 @@ static int ips_monitor(void *data)
> cur_seqno = (thm_readl(THM_ITV) & ITV_ME_SEQNO_MASK) >>
> ITV_ME_SEQNO_SHIFT;
> if (cur_seqno == last_seqno &&
> -   time_after(jiffies, seqno_timestamp + HZ)) {
> -   dev_warn(>dev->dev, "ME failed to update
> for more than 1s, likely hung\n");
> +   time_after(jiffies, seqno_timestamp + (HZ*30))) {
> +   /* It's been 30s.  ME is likely hung.  Print
> message and return. */
> +   dev_warn(>dev->dev, "ME failed to update
> for more than 30s, likely hung so exiting\n");
> +   del_timer_sync();
> +   destroy_timer_on_stack();
> +
> +   dev_dbg(>dev->dev, "ips-monitor thread
> stopped\n");
> +   return -1;
> +
> } else {
> seqno_timestamp = get_jiffies_64();
> last_seqno = cur_seqno;
> 
> There may be more to include such as restarting the monitor later to see
> if the ME ever came back, just looking for other ideas and/or suggestions.

Yeah, something like that might be ok.  But ideally we'd detect whether
an ME was present at all like Matthew says.

I'm not sure how to do that, but maybe Anas does (or can point us to
someone else who would know).

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] intel_ips: blacklist ASUSTek G60JX laptops

2013-08-29 Thread Jesse Barnes
On Thu, 29 Aug 2013 12:18:14 -0400
Joseph Salisbury joseph.salisb...@canonical.com wrote:

 On 08/14/2013 05:11 PM, Linus Torvalds wrote:
  On Wed, Aug 14, 2013 at 9:38 AM, Jesse Barnes jbar...@virtuousgeek.org 
  wrote:
  Linus, you may want to pick this up directly, as I'm not sure if
  Matthew is still looking after the x86 drivers these days.
  Can't we make the simpler patch be to just not spam the logs? Do it
  once, and forget about it. Maybe make the timeouts long enough to make
  sure that there are no other downsides from having the driver
  occasionally testing if it's back?
 
  I *detest* hardware-specific blacklists. They are impossible to
  maintain, and we've had the situation more than once that we fixed the
  real bug that caused the blacklist in the first place, but the entry
  stays around because nobody knows/cares/tests it.
 
Linus
 Hi Jesse,
 
 What are your thoughts on alternatives to using a blacklist?  Is there a
 bit we can read to determine if IPS is supported on a platform?
 
 If not, can we do something like the following?  Basically increase the
 timeout to 30 seconds then return an error from the monitor if there was
 no update in that time:
 
 diff --git a/drivers/platform/x86/intel_ips.c
 b/drivers/platform/x86/intel_ips.c
 index 18dcb58..a2391f7 100644
 --- a/drivers/platform/x86/intel_ips.c
 +++ b/drivers/platform/x86/intel_ips.c
 @@ -1095,8 +1095,15 @@ static int ips_monitor(void *data)
 cur_seqno = (thm_readl(THM_ITV)  ITV_ME_SEQNO_MASK) 
 ITV_ME_SEQNO_SHIFT;
 if (cur_seqno == last_seqno 
 -   time_after(jiffies, seqno_timestamp + HZ)) {
 -   dev_warn(ips-dev-dev, ME failed to update
 for more than 1s, likely hung\n);
 +   time_after(jiffies, seqno_timestamp + (HZ*30))) {
 +   /* It's been 30s.  ME is likely hung.  Print
 message and return. */
 +   dev_warn(ips-dev-dev, ME failed to update
 for more than 30s, likely hung so exiting\n);
 +   del_timer_sync(timer);
 +   destroy_timer_on_stack(timer);
 +
 +   dev_dbg(ips-dev-dev, ips-monitor thread
 stopped\n);
 +   return -1;
 +
 } else {
 seqno_timestamp = get_jiffies_64();
 last_seqno = cur_seqno;
 
 There may be more to include such as restarting the monitor later to see
 if the ME ever came back, just looking for other ideas and/or suggestions.

Yeah, something like that might be ok.  But ideally we'd detect whether
an ME was present at all like Matthew says.

I'm not sure how to do that, but maybe Anas does (or can point us to
someone else who would know).

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] intel_ips: blacklist ASUSTek G60JX laptops

2013-08-14 Thread Jesse Barnes
On Wed, 14 Aug 2013 12:22:49 -0400
Joseph Salisbury  wrote:

> BugLink: http://bugs.launchpad.net/bugs/1210848
> 
> On an ASUSTek G60JX laptop, the intel_ips driver spams the log with a warning 
> message: "ME failed to update for more than 1s, likely hung".  This ME 
> doesn't support the feature, so requesting it be blacklisted for now.   
> 
> Signed-off-by: Joseph Salisbury 
> Tested-by: Nick Jenkins  
> Cc: Jesse Barnes 
> Cc: platform-driver-...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: sta...@vger.kernel.org
> ---
>  drivers/platform/x86/intel_ips.c |8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/platform/x86/intel_ips.c 
> b/drivers/platform/x86/intel_ips.c
> index 18dcb58..6d8c5e0 100644
> --- a/drivers/platform/x86/intel_ips.c
> +++ b/drivers/platform/x86/intel_ips.c
> @@ -1501,6 +1501,14 @@ static const struct dmi_system_id ips_blacklist[] = {
>   DMI_MATCH(DMI_PRODUCT_NAME, "HP ProBook"),
>   },
>   },
> + {
> + .callback = ips_blacklist_callback,
> + .ident = "G60JX",
> + .matches = {
> + DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK Computer Inc."),
> + DMI_MATCH(DMI_PRODUCT_NAME, "G60JX"),
> + },
> + },
>   { } /* terminating entry */
>  };
>  

Acked-by: Jesse Barnes 

Linus, you may want to pick this up directly, as I'm not sure if
Matthew is still looking after the x86 drivers these days.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1] intel_ips: blacklist ASUSTek G60JX laptops

2013-08-14 Thread Jesse Barnes
On Wed, 14 Aug 2013 12:22:49 -0400
Joseph Salisbury joseph.salisb...@canonical.com wrote:

 BugLink: http://bugs.launchpad.net/bugs/1210848
 
 On an ASUSTek G60JX laptop, the intel_ips driver spams the log with a warning 
 message: ME failed to update for more than 1s, likely hung.  This ME 
 doesn't support the feature, so requesting it be blacklisted for now.   
 
 Signed-off-by: Joseph Salisbury joseph.salisb...@canonical.com
 Tested-by: Nick Jenkins tech.crew.jenk...@gmail.com 
 Cc: Jesse Barnes jbar...@virtuousgeek.org
 Cc: platform-driver-...@vger.kernel.org
 Cc: linux-kernel@vger.kernel.org
 Cc: sta...@vger.kernel.org
 ---
  drivers/platform/x86/intel_ips.c |8 
  1 file changed, 8 insertions(+)
 
 diff --git a/drivers/platform/x86/intel_ips.c 
 b/drivers/platform/x86/intel_ips.c
 index 18dcb58..6d8c5e0 100644
 --- a/drivers/platform/x86/intel_ips.c
 +++ b/drivers/platform/x86/intel_ips.c
 @@ -1501,6 +1501,14 @@ static const struct dmi_system_id ips_blacklist[] = {
   DMI_MATCH(DMI_PRODUCT_NAME, HP ProBook),
   },
   },
 + {
 + .callback = ips_blacklist_callback,
 + .ident = G60JX,
 + .matches = {
 + DMI_MATCH(DMI_SYS_VENDOR, ASUSTeK Computer Inc.),
 + DMI_MATCH(DMI_PRODUCT_NAME, G60JX),
 + },
 + },
   { } /* terminating entry */
  };
  

Acked-by: Jesse Barnes jbar...@virtuousgeek.org

Linus, you may want to pick this up directly, as I'm not sure if
Matthew is still looking after the x86 drivers these days.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drm/i915: add fast boot support for Haswell

2013-08-05 Thread Jesse Barnes
On Thu,  1 Aug 2013 14:12:22 -0700
Furquan Shaikh  wrote:

> @@ -1282,6 +1283,13 @@ static void intel_ddi_get_config(struct intel_encoder 
> *encoder,
>   flags |= DRM_MODE_FLAG_NVSYNC;
>  
>   pipe_config->adjusted_mode.flags |= flags;
> +
> + if (port == PORT_A) {
> + if ((I915_READ(DP_A) & DP_PLL_FREQ_MASK) == DP_PLL_FREQ_160MHZ)
> + pipe_config->port_clock = 162000;
> + else
> + pipe_config->port_clock = 27;
> + }

Sorry Furquan, I should have checked this earlier, I knew it was too
good to be true. :)

On HSW, DP_A is actually DDI_BUF_CTL, and it has a different layout
than the old DP_A reg.  Like Daniel said, doing it the old way is
invalid on HSW.  It might work in your configs, but I think that's just
coincidence, since bit 16 is the port reversal bit on HSW, not the
clock freq.

To get the clock freq, you need to look at 0x45020 to find the refclk,
then look at the WRPLL_CTL for the pipe to get the dividers.  That's
what Daniel meant when he asked for a full "clock_get" function.  It's
only a little more complicated, but you'll need docs for it.  Charlie
Huang ought to be able to get you the NDA docs that should have the
info you need.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drm/i915: add fast boot support for Haswell

2013-08-05 Thread Jesse Barnes
On Thu,  1 Aug 2013 14:12:22 -0700
Furquan Shaikh furq...@google.com wrote:

 @@ -1282,6 +1283,13 @@ static void intel_ddi_get_config(struct intel_encoder 
 *encoder,
   flags |= DRM_MODE_FLAG_NVSYNC;
  
   pipe_config-adjusted_mode.flags |= flags;
 +
 + if (port == PORT_A) {
 + if ((I915_READ(DP_A)  DP_PLL_FREQ_MASK) == DP_PLL_FREQ_160MHZ)
 + pipe_config-port_clock = 162000;
 + else
 + pipe_config-port_clock = 27;
 + }

Sorry Furquan, I should have checked this earlier, I knew it was too
good to be true. :)

On HSW, DP_A is actually DDI_BUF_CTL, and it has a different layout
than the old DP_A reg.  Like Daniel said, doing it the old way is
invalid on HSW.  It might work in your configs, but I think that's just
coincidence, since bit 16 is the port reversal bit on HSW, not the
clock freq.

To get the clock freq, you need to look at 0x45020 to find the refclk,
then look at the WRPLL_CTL for the pipe to get the dividers.  That's
what Daniel meant when he asked for a full clock_get function.  It's
only a little more complicated, but you'll need docs for it.  Charlie
Huang ought to be able to get you the NDA docs that should have the
info you need.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] x86: add early quirk for reserving Intel graphics stolen memory v5

2013-07-26 Thread Jesse Barnes
Systems with Intel graphics controllers set aside memory exclusively for
gfx driver use.  This memory is not always marked in the E820 as
reserved or as RAM, and so is subject to overlap from E820 manipulation
later in the boot process.  On some systems, MMIO space is allocated on
top, despite the efforts of the "RAM buffer" approach, which simply
rounds memory boundaries up to 64M to try to catch space that may decode
as RAM and so is not suitable for MMIO.

v2: use read_pci_config for 32 bit reads instead of adding a new one
(Chris)
add gen6 stolen size function (Chris)
v3: use a function pointer (Chris)
drop gen2 bits (Daniel)
v4: call e820_sanitize_map after adding the region
v5: fixup comments (Peter)
simplify loop (Chris)

Acked-by: Ingo Molnar 
Signed-off-by: Jesse Barnes 
---
 arch/x86/kernel/early-quirks.c  |  154 +++
 drivers/gpu/drm/i915/i915_reg.h |   15 
 include/drm/i915_drm.h  |   32 
 3 files changed, 186 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 94ab6b9..580080c 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -208,6 +209,157 @@ static void __init intel_remapping_check(int num, int 
slot, int func)
 
 }
 
+/*
+ * Systems with Intel graphics controllers set aside memory exclusively
+ * for gfx driver use.  This memory is not marked in the E820 as reserved
+ * or as RAM, and so is subject to overlap from E820 manipulation later
+ * in the boot process.  On some systems, MMIO space is allocated on top,
+ * despite the efforts of the "RAM buffer" approach, which simply rounds
+ * memory boundaries up to 64M to try to catch space that may decode
+ * as RAM and so is not suitable for MMIO.
+ *
+ * And yes, so far on current devices the base addr is always under 4G.
+ */
+static u32 __init intel_stolen_base(int num, int slot, int func)
+{
+   u32 base;
+
+   /*
+* For the PCI IDs in this quirk, the stolen base is always
+* in 0x5c, aka the BDSM register (yes that's really what
+* it's called).
+*/
+   base = read_pci_config(num, slot, func, 0x5c);
+   base &= ~((1<<20) - 1);
+
+   return base;
+}
+
+#define KB(x)  ((x) * 1024)
+#define MB(x)  (KB (KB (x)))
+#define GB(x)  (MB (KB (x)))
+
+static size_t __init gen3_stolen_size(int num, int slot, int func)
+{
+   size_t stolen_size;
+   u16 gmch_ctrl;
+
+   gmch_ctrl = read_pci_config_16(0, 0, 0, I830_GMCH_CTRL);
+
+   switch (gmch_ctrl & I855_GMCH_GMS_MASK) {
+   case I855_GMCH_GMS_STOLEN_1M:
+   stolen_size = MB(1);
+   break;
+   case I855_GMCH_GMS_STOLEN_4M:
+   stolen_size = MB(4);
+   break;
+   case I855_GMCH_GMS_STOLEN_8M:
+   stolen_size = MB(8);
+   break;
+   case I855_GMCH_GMS_STOLEN_16M:
+   stolen_size = MB(16);
+   break;
+   case I855_GMCH_GMS_STOLEN_32M:
+   stolen_size = MB(32);
+   break;
+   case I915_GMCH_GMS_STOLEN_48M:
+   stolen_size = MB(48);
+   break;
+   case I915_GMCH_GMS_STOLEN_64M:
+   stolen_size = MB(64);
+   break;
+   case G33_GMCH_GMS_STOLEN_128M:
+   stolen_size = MB(128);
+   break;
+   case G33_GMCH_GMS_STOLEN_256M:
+   stolen_size = MB(256);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_96M:
+   stolen_size = MB(96);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_160M:
+   stolen_size = MB(160);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_224M:
+   stolen_size = MB(224);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_352M:
+   stolen_size = MB(352);
+   break;
+   default:
+   stolen_size = 0;
+   break;
+   }
+
+   return stolen_size;
+}
+
+static size_t __init gen6_stolen_size(int num, int slot, int func)
+{
+   u16 gmch_ctrl;
+
+   gmch_ctrl = read_pci_config_16(num, slot, func, SNB_GMCH_CTRL);
+   gmch_ctrl >>= SNB_GMCH_GMS_SHIFT;
+   gmch_ctrl &= SNB_GMCH_GMS_MASK;
+
+   return gmch_ctrl << 25; /* 32 MB units */
+}
+
+typedef size_t (*stolen_size_fn)(int num, int slot, int func);
+
+static struct pci_device_id intel_stolen_ids[] __initdata = {
+   INTEL_I915G_IDS(gen3_stolen_size),
+   INTEL_I915GM_IDS(gen3_stolen_size),
+   INTEL_I945G_IDS(gen3_stolen_size),
+   INTEL_I945GM_IDS(gen3_stolen_size),
+   INTEL_VLV_M_IDS(gen3_stolen_size),
+   INTEL_VLV_D_IDS(gen3_stolen_size),
+   INTEL_PINEVIEW_IDS(gen3_stolen_size),
+   INTEL_I965G_IDS(gen3_stolen_size),
+   INTEL_G33_IDS(gen3_stole

[PATCH 1/2] drm/i915: split PCI IDs out into i915_drm.h v4

2013-07-26 Thread Jesse Barnes
For use by userspace (at some point in the future) and other kernel code.

v2: move PCI IDs to uabi (Chris)
move PCI IDs to drm/ (Dave)
v3: fixup Quanta detection - needs to come first (Daniel)
v4: fix up PCI match structure init for easier use by userspace (Chris)

Signed-off-by: Jesse Barnes 
---
 drivers/gpu/drm/i915/i915_drv.c |  164 +++---
 include/drm/i915_drm.h  |2 +
 include/drm/i915_pciids.h   |  211 +++
 3 files changed, 247 insertions(+), 130 deletions(-)
 create mode 100644 include/drm/i915_pciids.h

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b07362f..e87bccf 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -140,25 +140,6 @@ MODULE_PARM_DESC(fastboot, "Try to skip unnecessary mode 
sets at boot time "
 static struct drm_driver driver;
 extern int intel_agp_enabled;
 
-#define INTEL_VGA_DEVICE(id, info) {   \
-   .class = PCI_BASE_CLASS_DISPLAY << 16,  \
-   .class_mask = 0xff, \
-   .vendor = 0x8086,   \
-   .device = id,   \
-   .subvendor = PCI_ANY_ID,\
-   .subdevice = PCI_ANY_ID,\
-   .driver_data = (unsigned long) info }
-
-#define INTEL_QUANTA_VGA_DEVICE(info) {\
-   .class = PCI_BASE_CLASS_DISPLAY << 16,  \
-   .class_mask = 0xff, \
-   .vendor = 0x8086,   \
-   .device = 0x16a,\
-   .subvendor = 0x152d,\
-   .subdevice = 0x8990,\
-   .driver_data = (unsigned long) info }
-
-
 static const struct intel_device_info intel_i830_info = {
.gen = 2, .is_mobile = 1, .cursor_needs_physical = 1, .num_pipes = 2,
.has_overlay = 1, .overlay_needs_physical = 1,
@@ -333,118 +314,41 @@ static const struct intel_device_info 
intel_haswell_m_info = {
.has_vebox_ring = 1,
 };
 
+/*
+ * Make sure any device matches here are from most specific to most
+ * general.  For example, since the Quanta match is based on the subsystem
+ * and subvendor IDs, we need it to come before the more general IVB
+ * PCI ID matches, otherwise we'll use the wrong info struct above.
+ */
+#define INTEL_PCI_IDS \
+   INTEL_I830_IDS(_i830_info),   \
+   INTEL_I845G_IDS(_845g_info),  \
+   INTEL_I85X_IDS(_i85x_info),   \
+   INTEL_I865G_IDS(_i865g_info), \
+   INTEL_I915G_IDS(_i915g_info), \
+   INTEL_I915GM_IDS(_i915gm_info),   \
+   INTEL_I945G_IDS(_i945g_info), \
+   INTEL_I945GM_IDS(_i945gm_info),   \
+   INTEL_I965G_IDS(_i965g_info), \
+   INTEL_G33_IDS(_g33_info), \
+   INTEL_I965GM_IDS(_i965gm_info),   \
+   INTEL_GM45_IDS(_gm45_info),   \
+   INTEL_G45_IDS(_g45_info), \
+   INTEL_PINEVIEW_IDS(_pineview_info),   \
+   INTEL_IRONLAKE_D_IDS(_ironlake_d_info),   \
+   INTEL_IRONLAKE_M_IDS(_ironlake_m_info),   \
+   INTEL_SNB_D_IDS(_sandybridge_d_info), \
+   INTEL_SNB_M_IDS(_sandybridge_m_info), \
+   INTEL_IVB_Q_IDS(_ivybridge_q_info), /* must be first IVB */ \
+   INTEL_IVB_M_IDS(_ivybridge_m_info),   \
+   INTEL_IVB_D_IDS(_ivybridge_d_info),   \
+   INTEL_HSW_D_IDS(_haswell_d_info), \
+   INTEL_HSW_M_IDS(_haswell_m_info), \
+   INTEL_VLV_M_IDS(_valleyview_m_info),  \
+   INTEL_VLV_D_IDS(_valleyview_d_info)
+
 static const struct pci_device_id pciidlist[] = {  /* aka */
-   INTEL_VGA_DEVICE(0x3577, _i830_info), /* I830_M */
-   INTEL_VGA_DEVICE(0x2562, _845g_info), /* 845_G */
-   INTEL_VGA_DEVICE(0x3582, _i85x_info), /* I855_GM */
-   INTEL_VGA_DEVICE(0x358e, _i85x_info),
-   INTEL_VGA_DEVICE(0x2572, _i865g_info),/* I865_G */
-   INTEL_VGA_DEVICE(0x2582, _i915g_info),/* I915_G */
-   INTEL_VGA_DEVICE(0x258a, _i915g_info),/* E7221_G */
-   INTEL_VGA_DEVICE(0x2592, _i915gm_info),   /* I915_GM */
-   INTEL_VGA_DEVICE(0x2772, _i945g_info),/* I945_G */
-   INTEL_VGA_DEVICE(0x27a2, _i945gm_info),   /* I945_GM */
-   INTEL_VGA_DEVICE(0x27ae, _i945gm_info),   /* I945_GME */
-   INTEL_VGA_DEVICE(0x2972, _i965g_info),/* I946_GZ */
-   INTEL_VGA_DEVICE(0x2982, _i965g_info),/* G35_G */
-   INTEL_VGA_DEVICE(0x2992, _i965g_info),/* I965_Q */
-   INTEL_VGA_DEVICE(0x29a2, _i965g_info),/* I965_G */
-   INTEL_VGA_DEVICE(0x29b2, _g33_info),  /* Q35_G */
-   INTEL_VGA_DEVICE(0x29c2, _g33_info),  /* G33_G */
-   INTEL_VGA_DEVICE(0x29d2, _g33_info),  /* Q33_G */
-   INTEL_VGA_DEVICE(0x2a02, _i965gm_info)

Updated stolen mem patches

2013-07-26 Thread Jesse Barnes
These address the comments I've received so far, but omit the new E820
type for this mem.

Chris's patches could go on top if desired; they add a new type and
resource reservation function for looking up regions by name.  That
allows us to remove some duplicate code in the driver for finding stolen
space.

But I think these two are ready as-is.  How should we merge them?  Just
through the i915 tree since the first one touches our headers?  that and
there probably won't be conflicts on the early-quirks file; that's not
touch too often...

Thanks,
Jesse

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] x86: add early quirk for reserving Intel graphics stolen memory v3

2013-07-26 Thread Jesse Barnes
On Fri, 26 Jul 2013 09:58:45 -0700
"H. Peter Anvin"  wrote:

> On 07/25/2013 09:37 AM, Jesse Barnes wrote:
> > Systems with Intel graphics controllers set aside memory exclusively for
> > +   /*
> > +* Almost universally we can find the Graphics Base of Stolen Memory
> > +* at offset 0x5c in the igfx configuration space. On a few (desktop)
> > +* machines this is also mirrored in the bridge device at different
> > +* locations, or in the MCHBAR.
> > +*/
> 
> This comment makes me nervous.  It isn't clear to me if it is saying:
> 
> - All igfx devices has the graphics base at 0x5c, a few have it in
>   other places, too (which doesn't matter, we can use 0x5c anyway), or
> - Most igfx devices have the graphics base at 0x5c, some don't, and we
>   hope and pray we're not on one of those systems because we're not
>   checking.
> 
> I assume it is the former, but it really needs to be phrased better.

I mostly copied it from the driver where we had some other ways of
finding it too.  For the PCI IDs listed, we can always use 0x5c.  I'll
fix the comment.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ugly patches for stolen reservation

2013-07-26 Thread Jesse Barnes
On Thu, 25 Jul 2013 17:31:29 -0700
Linus Torvalds  wrote:

> On Thu, Jul 25, 2013 at 3:42 PM, H. Peter Anvin  wrote:
> > So the bootloader is just as likely to step on things... what happens 
> > when/if it does?
> 
> This isn't a new problem. We've had this "firmware tables don't show
> all devices" issue before.
> 
> The only odd thing about this one is how the quirk in question uses
> "e820_add_region()" instead of just adding things to the MMIO list.
> And I think that's actually likely a mistake.
> 
> So Jesse, why don't you do what the other quirks do, and claim an
> actual MMIO resource? If you make it a real resource, you'll get to
> use fancy things like REAL NAMES, and actually document it. With
> human-readable strings.
> 
> See quirk_io_region() in drivers/pci/quirks.c for example. The same
> code except for IORESOURCE_MEM should do a lovely job..
> 
> And even *if* it's already marked reserved in the e820 table, it just
> looks nice in /proc/iomem.
> 
> Hmm?

I should have mentioned yesterday that we *do* try to reserve the
resource in our driver init, with pretty name and everything.

The issue here is making sure we are actually *able* to reserve it
without conflicts at driver init time.

If the PCI layer has put something there (misc MMIO or the "RAM
buffer" intended to prevent stuff like this) because it's not listed in
the E820 map, we'll fail to get at this memory in our driver init.

Thus the early reservation in early-quirks, followed by a real
request_mem_region later on.

Doing the request_mem_region before the PCI layer gets its hands on it
isn't really possible, because __request_region depends on the memory
allocator being initialized.  So rather than add a new hook elsewhere in
setup_arch or start_kernel I figured I'd use an early quirk.

Reasonable?  Note iomem in both cases.

Before (RAM buffer prevents our allocation):
cafff000-caff : System RAM
cb00-cbff : RAM buffer
cfa0-feaf : PCI Bus :00
  d000-dfff : :00:02.0

After (yay):
cb00-cb9f : RAM buffer
cba0-cf9f : reserved
  cba0-cf9f : Graphics Stolen Memory
cfa0-feaf : PCI Bus :00

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ugly patches for stolen reservation

2013-07-26 Thread Jesse Barnes
On Thu, 25 Jul 2013 20:31:27 -0700
"H. Peter Anvin"  wrote:

> On 07/25/2013 07:14 PM, Jesse Barnes wrote:
> > To clarify: it'll either be marked reserved or not listed at all in e820, 
> > which is why I did this early, before any other e820 stuff like the "RAM 
> > buffer" are allocated, and before we could use the iomem resource (or maybe 
> > we could even early per Linus? I'll check). 
> > 
> > Jesse
> 
> If it is marked reserved or not listed at all it is much less of an
> issue.  Reserved is in fact the correct thing; not listed at all really
> isn't very problematic in most cases.

Yeah the problems seem to fall into two categories:
  1) mmio space is allocated in this range:
 https://bugs.freedesktop.org/show_bug.cgi?id=66726
  2) range gets partially allocated to the "RAM buffer"
 https://bugs.freedesktop.org/show_bug.cgi?id=66844

Case (1) is the one that worries me.  I'm guessing it'll mainly be a
problem on machines where MMIO space is limited or somehow structured
such that PCI resources end up there when we allocate them.  Depending
on what gets put there and the decode priority, behavior may be poor.

Case (2) isn't harmful, but ends up causing our driver to skip stolen
memory initialization, because of the conflict.

Anyway I'll look at Linus's suggestion of reserving in the iomem
resource really early and roll in Chris's stuff if that doesn't work
out.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ugly patches for stolen reservation

2013-07-26 Thread Jesse Barnes
On Thu, 25 Jul 2013 20:31:27 -0700
H. Peter Anvin h...@zytor.com wrote:

 On 07/25/2013 07:14 PM, Jesse Barnes wrote:
  To clarify: it'll either be marked reserved or not listed at all in e820, 
  which is why I did this early, before any other e820 stuff like the RAM 
  buffer are allocated, and before we could use the iomem resource (or maybe 
  we could even early per Linus? I'll check). 
  
  Jesse
 
 If it is marked reserved or not listed at all it is much less of an
 issue.  Reserved is in fact the correct thing; not listed at all really
 isn't very problematic in most cases.

Yeah the problems seem to fall into two categories:
  1) mmio space is allocated in this range:
 https://bugs.freedesktop.org/show_bug.cgi?id=66726
  2) range gets partially allocated to the RAM buffer
 https://bugs.freedesktop.org/show_bug.cgi?id=66844

Case (1) is the one that worries me.  I'm guessing it'll mainly be a
problem on machines where MMIO space is limited or somehow structured
such that PCI resources end up there when we allocate them.  Depending
on what gets put there and the decode priority, behavior may be poor.

Case (2) isn't harmful, but ends up causing our driver to skip stolen
memory initialization, because of the conflict.

Anyway I'll look at Linus's suggestion of reserving in the iomem
resource really early and roll in Chris's stuff if that doesn't work
out.

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ugly patches for stolen reservation

2013-07-26 Thread Jesse Barnes
On Thu, 25 Jul 2013 17:31:29 -0700
Linus Torvalds torva...@linux-foundation.org wrote:

 On Thu, Jul 25, 2013 at 3:42 PM, H. Peter Anvin h...@zytor.com wrote:
  So the bootloader is just as likely to step on things... what happens 
  when/if it does?
 
 This isn't a new problem. We've had this firmware tables don't show
 all devices issue before.
 
 The only odd thing about this one is how the quirk in question uses
 e820_add_region() instead of just adding things to the MMIO list.
 And I think that's actually likely a mistake.
 
 So Jesse, why don't you do what the other quirks do, and claim an
 actual MMIO resource? If you make it a real resource, you'll get to
 use fancy things like REAL NAMES, and actually document it. With
 human-readable strings.
 
 See quirk_io_region() in drivers/pci/quirks.c for example. The same
 code except for IORESOURCE_MEM should do a lovely job..
 
 And even *if* it's already marked reserved in the e820 table, it just
 looks nice in /proc/iomem.
 
 Hmm?

I should have mentioned yesterday that we *do* try to reserve the
resource in our driver init, with pretty name and everything.

The issue here is making sure we are actually *able* to reserve it
without conflicts at driver init time.

If the PCI layer has put something there (misc MMIO or the RAM
buffer intended to prevent stuff like this) because it's not listed in
the E820 map, we'll fail to get at this memory in our driver init.

Thus the early reservation in early-quirks, followed by a real
request_mem_region later on.

Doing the request_mem_region before the PCI layer gets its hands on it
isn't really possible, because __request_region depends on the memory
allocator being initialized.  So rather than add a new hook elsewhere in
setup_arch or start_kernel I figured I'd use an early quirk.

Reasonable?  Note iomem in both cases.

Before (RAM buffer prevents our allocation):
cafff000-caff : System RAM
cb00-cbff : RAM buffer
cfa0-feaf : PCI Bus :00
  d000-dfff : :00:02.0

After (yay):
cb00-cb9f : RAM buffer
cba0-cf9f : reserved
  cba0-cf9f : Graphics Stolen Memory
cfa0-feaf : PCI Bus :00

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] x86: add early quirk for reserving Intel graphics stolen memory v3

2013-07-26 Thread Jesse Barnes
On Fri, 26 Jul 2013 09:58:45 -0700
H. Peter Anvin h...@zytor.com wrote:

 On 07/25/2013 09:37 AM, Jesse Barnes wrote:
  Systems with Intel graphics controllers set aside memory exclusively for
  +   /*
  +* Almost universally we can find the Graphics Base of Stolen Memory
  +* at offset 0x5c in the igfx configuration space. On a few (desktop)
  +* machines this is also mirrored in the bridge device at different
  +* locations, or in the MCHBAR.
  +*/
 
 This comment makes me nervous.  It isn't clear to me if it is saying:
 
 - All igfx devices has the graphics base at 0x5c, a few have it in
   other places, too (which doesn't matter, we can use 0x5c anyway), or
 - Most igfx devices have the graphics base at 0x5c, some don't, and we
   hope and pray we're not on one of those systems because we're not
   checking.
 
 I assume it is the former, but it really needs to be phrased better.

I mostly copied it from the driver where we had some other ways of
finding it too.  For the PCI IDs listed, we can always use 0x5c.  I'll
fix the comment.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Updated stolen mem patches

2013-07-26 Thread Jesse Barnes
These address the comments I've received so far, but omit the new E820
type for this mem.

Chris's patches could go on top if desired; they add a new type and
resource reservation function for looking up regions by name.  That
allows us to remove some duplicate code in the driver for finding stolen
space.

But I think these two are ready as-is.  How should we merge them?  Just
through the i915 tree since the first one touches our headers?  that and
there probably won't be conflicts on the early-quirks file; that's not
touch too often...

Thanks,
Jesse

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] drm/i915: split PCI IDs out into i915_drm.h v4

2013-07-26 Thread Jesse Barnes
For use by userspace (at some point in the future) and other kernel code.

v2: move PCI IDs to uabi (Chris)
move PCI IDs to drm/ (Dave)
v3: fixup Quanta detection - needs to come first (Daniel)
v4: fix up PCI match structure init for easier use by userspace (Chris)

Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 drivers/gpu/drm/i915/i915_drv.c |  164 +++---
 include/drm/i915_drm.h  |2 +
 include/drm/i915_pciids.h   |  211 +++
 3 files changed, 247 insertions(+), 130 deletions(-)
 create mode 100644 include/drm/i915_pciids.h

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b07362f..e87bccf 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -140,25 +140,6 @@ MODULE_PARM_DESC(fastboot, Try to skip unnecessary mode 
sets at boot time 
 static struct drm_driver driver;
 extern int intel_agp_enabled;
 
-#define INTEL_VGA_DEVICE(id, info) {   \
-   .class = PCI_BASE_CLASS_DISPLAY  16,  \
-   .class_mask = 0xff, \
-   .vendor = 0x8086,   \
-   .device = id,   \
-   .subvendor = PCI_ANY_ID,\
-   .subdevice = PCI_ANY_ID,\
-   .driver_data = (unsigned long) info }
-
-#define INTEL_QUANTA_VGA_DEVICE(info) {\
-   .class = PCI_BASE_CLASS_DISPLAY  16,  \
-   .class_mask = 0xff, \
-   .vendor = 0x8086,   \
-   .device = 0x16a,\
-   .subvendor = 0x152d,\
-   .subdevice = 0x8990,\
-   .driver_data = (unsigned long) info }
-
-
 static const struct intel_device_info intel_i830_info = {
.gen = 2, .is_mobile = 1, .cursor_needs_physical = 1, .num_pipes = 2,
.has_overlay = 1, .overlay_needs_physical = 1,
@@ -333,118 +314,41 @@ static const struct intel_device_info 
intel_haswell_m_info = {
.has_vebox_ring = 1,
 };
 
+/*
+ * Make sure any device matches here are from most specific to most
+ * general.  For example, since the Quanta match is based on the subsystem
+ * and subvendor IDs, we need it to come before the more general IVB
+ * PCI ID matches, otherwise we'll use the wrong info struct above.
+ */
+#define INTEL_PCI_IDS \
+   INTEL_I830_IDS(intel_i830_info),   \
+   INTEL_I845G_IDS(intel_845g_info),  \
+   INTEL_I85X_IDS(intel_i85x_info),   \
+   INTEL_I865G_IDS(intel_i865g_info), \
+   INTEL_I915G_IDS(intel_i915g_info), \
+   INTEL_I915GM_IDS(intel_i915gm_info),   \
+   INTEL_I945G_IDS(intel_i945g_info), \
+   INTEL_I945GM_IDS(intel_i945gm_info),   \
+   INTEL_I965G_IDS(intel_i965g_info), \
+   INTEL_G33_IDS(intel_g33_info), \
+   INTEL_I965GM_IDS(intel_i965gm_info),   \
+   INTEL_GM45_IDS(intel_gm45_info),   \
+   INTEL_G45_IDS(intel_g45_info), \
+   INTEL_PINEVIEW_IDS(intel_pineview_info),   \
+   INTEL_IRONLAKE_D_IDS(intel_ironlake_d_info),   \
+   INTEL_IRONLAKE_M_IDS(intel_ironlake_m_info),   \
+   INTEL_SNB_D_IDS(intel_sandybridge_d_info), \
+   INTEL_SNB_M_IDS(intel_sandybridge_m_info), \
+   INTEL_IVB_Q_IDS(intel_ivybridge_q_info), /* must be first IVB */ \
+   INTEL_IVB_M_IDS(intel_ivybridge_m_info),   \
+   INTEL_IVB_D_IDS(intel_ivybridge_d_info),   \
+   INTEL_HSW_D_IDS(intel_haswell_d_info), \
+   INTEL_HSW_M_IDS(intel_haswell_m_info), \
+   INTEL_VLV_M_IDS(intel_valleyview_m_info),  \
+   INTEL_VLV_D_IDS(intel_valleyview_d_info)
+
 static const struct pci_device_id pciidlist[] = {  /* aka */
-   INTEL_VGA_DEVICE(0x3577, intel_i830_info), /* I830_M */
-   INTEL_VGA_DEVICE(0x2562, intel_845g_info), /* 845_G */
-   INTEL_VGA_DEVICE(0x3582, intel_i85x_info), /* I855_GM */
-   INTEL_VGA_DEVICE(0x358e, intel_i85x_info),
-   INTEL_VGA_DEVICE(0x2572, intel_i865g_info),/* I865_G */
-   INTEL_VGA_DEVICE(0x2582, intel_i915g_info),/* I915_G */
-   INTEL_VGA_DEVICE(0x258a, intel_i915g_info),/* E7221_G */
-   INTEL_VGA_DEVICE(0x2592, intel_i915gm_info),   /* I915_GM */
-   INTEL_VGA_DEVICE(0x2772, intel_i945g_info),/* I945_G */
-   INTEL_VGA_DEVICE(0x27a2, intel_i945gm_info),   /* I945_GM */
-   INTEL_VGA_DEVICE(0x27ae, intel_i945gm_info),   /* I945_GME */
-   INTEL_VGA_DEVICE(0x2972, intel_i965g_info),/* I946_GZ */
-   INTEL_VGA_DEVICE(0x2982, intel_i965g_info),/* G35_G */
-   INTEL_VGA_DEVICE(0x2992, intel_i965g_info),/* I965_Q */
-   INTEL_VGA_DEVICE(0x29a2, intel_i965g_info),/* I965_G */
-   INTEL_VGA_DEVICE(0x29b2, intel_g33_info),  /* Q35_G

[PATCH 2/2] x86: add early quirk for reserving Intel graphics stolen memory v5

2013-07-26 Thread Jesse Barnes
Systems with Intel graphics controllers set aside memory exclusively for
gfx driver use.  This memory is not always marked in the E820 as
reserved or as RAM, and so is subject to overlap from E820 manipulation
later in the boot process.  On some systems, MMIO space is allocated on
top, despite the efforts of the RAM buffer approach, which simply
rounds memory boundaries up to 64M to try to catch space that may decode
as RAM and so is not suitable for MMIO.

v2: use read_pci_config for 32 bit reads instead of adding a new one
(Chris)
add gen6 stolen size function (Chris)
v3: use a function pointer (Chris)
drop gen2 bits (Daniel)
v4: call e820_sanitize_map after adding the region
v5: fixup comments (Peter)
simplify loop (Chris)

Acked-by: Ingo Molnar mi...@kernel.org
Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 arch/x86/kernel/early-quirks.c  |  154 +++
 drivers/gpu/drm/i915/i915_reg.h |   15 
 include/drm/i915_drm.h  |   32 
 3 files changed, 186 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 94ab6b9..580080c 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -12,6 +12,7 @@
 #include linux/pci.h
 #include linux/acpi.h
 #include linux/pci_ids.h
+#include drm/i915_drm.h
 #include asm/pci-direct.h
 #include asm/dma.h
 #include asm/io_apic.h
@@ -208,6 +209,157 @@ static void __init intel_remapping_check(int num, int 
slot, int func)
 
 }
 
+/*
+ * Systems with Intel graphics controllers set aside memory exclusively
+ * for gfx driver use.  This memory is not marked in the E820 as reserved
+ * or as RAM, and so is subject to overlap from E820 manipulation later
+ * in the boot process.  On some systems, MMIO space is allocated on top,
+ * despite the efforts of the RAM buffer approach, which simply rounds
+ * memory boundaries up to 64M to try to catch space that may decode
+ * as RAM and so is not suitable for MMIO.
+ *
+ * And yes, so far on current devices the base addr is always under 4G.
+ */
+static u32 __init intel_stolen_base(int num, int slot, int func)
+{
+   u32 base;
+
+   /*
+* For the PCI IDs in this quirk, the stolen base is always
+* in 0x5c, aka the BDSM register (yes that's really what
+* it's called).
+*/
+   base = read_pci_config(num, slot, func, 0x5c);
+   base = ~((120) - 1);
+
+   return base;
+}
+
+#define KB(x)  ((x) * 1024)
+#define MB(x)  (KB (KB (x)))
+#define GB(x)  (MB (KB (x)))
+
+static size_t __init gen3_stolen_size(int num, int slot, int func)
+{
+   size_t stolen_size;
+   u16 gmch_ctrl;
+
+   gmch_ctrl = read_pci_config_16(0, 0, 0, I830_GMCH_CTRL);
+
+   switch (gmch_ctrl  I855_GMCH_GMS_MASK) {
+   case I855_GMCH_GMS_STOLEN_1M:
+   stolen_size = MB(1);
+   break;
+   case I855_GMCH_GMS_STOLEN_4M:
+   stolen_size = MB(4);
+   break;
+   case I855_GMCH_GMS_STOLEN_8M:
+   stolen_size = MB(8);
+   break;
+   case I855_GMCH_GMS_STOLEN_16M:
+   stolen_size = MB(16);
+   break;
+   case I855_GMCH_GMS_STOLEN_32M:
+   stolen_size = MB(32);
+   break;
+   case I915_GMCH_GMS_STOLEN_48M:
+   stolen_size = MB(48);
+   break;
+   case I915_GMCH_GMS_STOLEN_64M:
+   stolen_size = MB(64);
+   break;
+   case G33_GMCH_GMS_STOLEN_128M:
+   stolen_size = MB(128);
+   break;
+   case G33_GMCH_GMS_STOLEN_256M:
+   stolen_size = MB(256);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_96M:
+   stolen_size = MB(96);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_160M:
+   stolen_size = MB(160);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_224M:
+   stolen_size = MB(224);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_352M:
+   stolen_size = MB(352);
+   break;
+   default:
+   stolen_size = 0;
+   break;
+   }
+
+   return stolen_size;
+}
+
+static size_t __init gen6_stolen_size(int num, int slot, int func)
+{
+   u16 gmch_ctrl;
+
+   gmch_ctrl = read_pci_config_16(num, slot, func, SNB_GMCH_CTRL);
+   gmch_ctrl = SNB_GMCH_GMS_SHIFT;
+   gmch_ctrl = SNB_GMCH_GMS_MASK;
+
+   return gmch_ctrl  25; /* 32 MB units */
+}
+
+typedef size_t (*stolen_size_fn)(int num, int slot, int func);
+
+static struct pci_device_id intel_stolen_ids[] __initdata = {
+   INTEL_I915G_IDS(gen3_stolen_size),
+   INTEL_I915GM_IDS(gen3_stolen_size),
+   INTEL_I945G_IDS(gen3_stolen_size),
+   INTEL_I945GM_IDS(gen3_stolen_size),
+   INTEL_VLV_M_IDS(gen3_stolen_size),
+   INTEL_VLV_D_IDS(gen3_stolen_size),
+   INTEL_PINEVIEW_IDS(gen3_stolen_size

Re: [PATCH 2/2] x86: add early quirk for reserving Intel graphics stolen memory v3

2013-07-25 Thread Jesse Barnes
On Thu, 25 Jul 2013 23:59:25 +0100
Chris Wilson  wrote:

> Hmm, interesting licence block in i915_pciids.h - our claim to
> grant licence but TG disclaims liability.

Oops my manual search & replace obviously failed.  Will fix up.

-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ugly patches for stolen reservation

2013-07-25 Thread Jesse Barnes
Well, it's ok if the boot loader writes to this memory, the worst
that'll happen is you'll see garbage on the screen.  If the boot loader
tries to do MMIO mapping on top it'll get into trouble... but why would
it do that?

Jesse

On Thu, 25 Jul 2013 15:42:25 -0700
"H. Peter Anvin"  wrote:

> So the bootloader is just as likely to step on things... what happens when/if 
> it does?
> 
> Ingo Molnar  wrote:
> >
> >* Jesse Barnes  wrote:
> >
> >> Patch 2/2 has the description, but suffice it to say I'm 
> >> not really pleased with this, though it does solve a 
> >> problem we have.  On some machines, we get MMIO space 
> >> allocated on top of this hidden memory, which can cause 
> >> problems.  I'm not sure if there are similar problems for 
> >> other hunks of the address space; if so it's possible 
> >> this could be made more general (though the bits for 
> >> looking up the address of this region are definitely 
> >> Intel graphics specific).
> >
> >It looks pretty hardware specific. Discovering it the hard 
> >way and marking it e820 reserved in an early quirk is what 
> >the firmware should have done to begin with - and I doubt 
> >the kernel could do anything significantly cleaner.
> >
> >How does Windows manage to not crash? By luckily never 
> >allocating PCI resources on top of the RAM? Or does it have 
> >a quirk?
> >
> >> Chris has some patches on top to add a new E820 type so 
> >> we can look up the region later, which removes some 
> >> redundant code in the i915 driver at least.
> >> 
> >> Any comments?  I assume no one likes this, but maybe it's 
> >> just another early quirk we'll have to live with...
> >
> >No strong feelings against it - my only suggestion would be 
> >to make this more visible - right now it's added as e820 
> >reserved which hides amongst other areas already marked 
> >reserved - would a low-key printk() of the range added make 
> >it more apparent that a kernel quirk activated here?
> >
> >Just so that people know that it came from the kernel, not 
> >the firmware.
> >
> >But in any case:
> >
> >Acked-by: Ingo Molnar 
> >
> >Thanks,
> >
> > Ingo
> 


-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Ugly patches for stolen reservation

2013-07-25 Thread Jesse Barnes
On Thu, 25 Jul 2013 22:05:51 +0200
Ingo Molnar  wrote:

> 
> * Jesse Barnes  wrote:
> 
> > Patch 2/2 has the description, but suffice it to say I'm 
> > not really pleased with this, though it does solve a 
> > problem we have.  On some machines, we get MMIO space 
> > allocated on top of this hidden memory, which can cause 
> > problems.  I'm not sure if there are similar problems for 
> > other hunks of the address space; if so it's possible 
> > this could be made more general (though the bits for 
> > looking up the address of this region are definitely 
> > Intel graphics specific).
> 
> It looks pretty hardware specific. Discovering it the hard 
> way and marking it e820 reserved in an early quirk is what 
> the firmware should have done to begin with - and I doubt 
> the kernel could do anything significantly cleaner.
> 
> How does Windows manage to not crash? By luckily never 
> allocating PCI resources on top of the RAM? Or does it have 
> a quirk?

Pretty sure Windows doesn't allocate MMIO space the same way we do, so
doesn't run into this on platforms where it's not E820_RESERVED.  On
top of that, BIOS vendors probably just move things around until
Windows boots and the device manager doesn't have any dreaded "yellow
bang" icons that would prevent them from getting their "designed for
windows" sticker.

> > Chris has some patches on top to add a new E820 type so 
> > we can look up the region later, which removes some 
> > redundant code in the i915 driver at least.
> > 
> > Any comments?  I assume no one likes this, but maybe it's 
> > just another early quirk we'll have to live with...
> 
> No strong feelings against it - my only suggestion would be 
> to make this more visible - right now it's added as e820 
> reserved which hides amongst other areas already marked 
> reserved - would a low-key printk() of the range added make 
> it more apparent that a kernel quirk activated here?

Sounds good, I think Chris's patches should satisfy there.  They make
it a new E820 type so it's clear in /proc/iomem too.

> 
> Just so that people know that it came from the kernel, not 
> the firmware.
> 
> But in any case:
> 
> Acked-by: Ingo Molnar 

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] x86: add early quirk for reserving Intel graphics stolen memory v3

2013-07-25 Thread Jesse Barnes
Systems with Intel graphics controllers set aside memory exclusively for
gfx driver use.  This memory is not marked in the E820 as reserved or as
RAM, and so is subject to overlap from E820 manipulation later in the
boot process.  On some systems, MMIO space is allocated on top, despite
the efforts of the "RAM buffer" approach, which simply rounds memory
boundaries up to 64M to try to catch space that may decode as RAM and so
is not suitable for MMIO.

v2: use read_pci_config for 32 bit reads instead of adding a new one
(Chris)
add gen6 stolen size function (Chris)
use a function pointer (Chris)
drop gen2 bits (Daniel)

Signed-off-by: Jesse Barnes 
---
 arch/x86/kernel/early-quirks.c  |  158 +++
 drivers/gpu/drm/i915/i915_reg.h |   15 
 include/drm/i915_drm.h  |   32 
 3 files changed, 190 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 94ab6b9..bff8a6f 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -208,6 +209,161 @@ static void __init intel_remapping_check(int num, int 
slot, int func)
 
 }
 
+/*
+ * Systems with Intel graphics controllers set aside memory exclusively
+ * for gfx driver use.  This memory is not marked in the E820 as reserved
+ * or as RAM, and so is subject to overlap from E820 manipulation later
+ * in the boot process.  On some systems, MMIO space is allocated on top,
+ * despite the efforts of the "RAM buffer" approach, which simply rounds
+ * memory boundaries up to 64M to try to catch space that may decode
+ * as RAM and so is not suitable for MMIO.
+ *
+ * And yes, so far on current devices the base addr is always under 4G.
+ */
+static u32 __init intel_stolen_base(int num, int slot, int func)
+{
+   u32 base;
+
+   /*
+* Almost universally we can find the Graphics Base of Stolen Memory
+* at offset 0x5c in the igfx configuration space. On a few (desktop)
+* machines this is also mirrored in the bridge device at different
+* locations, or in the MCHBAR.
+*/
+   base = read_pci_config(num, slot, func, 0x5c);
+   base &= ~((1<<20) - 1);
+
+   return base;
+}
+
+#define KB(x)  ((x) * 1024)
+#define MB(x)  (KB (KB (x)))
+#define GB(x)  (MB (KB (x)))
+
+static size_t __init gen3_stolen_size(int num, int slot, int func)
+{
+   size_t stolen_size;
+   u16 gmch_ctrl;
+
+   gmch_ctrl = read_pci_config_16(0, 0, 0, I830_GMCH_CTRL);
+
+   switch (gmch_ctrl & I855_GMCH_GMS_MASK) {
+   case I855_GMCH_GMS_STOLEN_1M:
+   stolen_size = MB(1);
+   break;
+   case I855_GMCH_GMS_STOLEN_4M:
+   stolen_size = MB(4);
+   break;
+   case I855_GMCH_GMS_STOLEN_8M:
+   stolen_size = MB(8);
+   break;
+   case I855_GMCH_GMS_STOLEN_16M:
+   stolen_size = MB(16);
+   break;
+   case I855_GMCH_GMS_STOLEN_32M:
+   stolen_size = MB(32);
+   break;
+   case I915_GMCH_GMS_STOLEN_48M:
+   stolen_size = MB(48);
+   break;
+   case I915_GMCH_GMS_STOLEN_64M:
+   stolen_size = MB(64);
+   break;
+   case G33_GMCH_GMS_STOLEN_128M:
+   stolen_size = MB(128);
+   break;
+   case G33_GMCH_GMS_STOLEN_256M:
+   stolen_size = MB(256);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_96M:
+   stolen_size = MB(96);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_160M:
+   stolen_size = MB(160);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_224M:
+   stolen_size = MB(224);
+   break;
+   case INTEL_GMCH_GMS_STOLEN_352M:
+   stolen_size = MB(352);
+   break;
+   default:
+   stolen_size = 0;
+   break;
+   }
+
+   return stolen_size;
+}
+
+static size_t __init gen6_stolen_size(int num, int slot, int func)
+{
+   u16 gmch_ctrl;
+
+   gmch_ctrl = read_pci_config_16(num, slot, func, SNB_GMCH_CTRL);
+   gmch_ctrl >>= SNB_GMCH_GMS_SHIFT;
+   gmch_ctrl &= SNB_GMCH_GMS_MASK;
+
+   return gmch_ctrl << 25; /* 32 MB units */
+}
+
+typedef size_t (*stolen_size_fn)(int num, int slot, int func);
+
+static struct pci_device_id intel_stolen_ids[] __initdata = {
+   INTEL_I915G_IDS(gen3_stolen_size),
+   INTEL_I915GM_IDS(gen3_stolen_size),
+   INTEL_I945G_IDS(gen3_stolen_size),
+   INTEL_I945GM_IDS(gen3_stolen_size),
+   INTEL_VLV_M_IDS(gen3_stolen_size),
+   INTEL_VLV_D_IDS(gen3_stolen_size),
+   INTEL_PINEVIEW_IDS(gen3_stolen_size),
+   INTEL_I965G_IDS(gen3_stolen_size),
+   INTEL_G33_IDS(gen3_stolen_size),
+   INTEL_I9

Ugly patches for stolen reservation

2013-07-25 Thread Jesse Barnes
Patch 2/2 has the description, but suffice it to say I'm not really
pleased with this, though it does solve a problem we have.  On some
machines, we get MMIO space allocated on top of this hidden memory,
which can cause problems.  I'm not sure if there are similar problems
for other hunks of the address space; if so it's possible this could be
made more general (though the bits for looking up the address of this
region are definitely Intel graphics specific).

Chris has some patches on top to add a new E820 type so we can look up
the region later, which removes some redundant code in the i915 driver
at least.

Any comments?  I assume no one likes this, but maybe it's just another
early quirk we'll have to live with...

Thanks,
Jesse

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   3   4   5   6   7   8   9   >