[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-11 Thread Bridgman, John
>From: Ilyes Gouta [mailto:ilyes.gouta at gmail.com] 
>Sent: Friday, July 11, 2014 2:00 PM
>To: Bridgman, John
>Cc: Alex Deucher; Koenig, Christian; Oded Gabbay; Deucher, Alexander; Lewycky, 
>Andrew; LKML; Mailing list - DRI developers
>Subject: Re: [PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes 
>in KV
>
>Hi,
>
>Just a side question (for information),
>
>On Fri, Jul 11, 2014 at 6:07 PM, Bridgman, John wrote:
>
>Right. The SET_RESOURCES packet (kfd_pm4_headers.h, added in patch 49)
>allocates a range of HW queues, VMIDs and GDS to the HW scheduler, then the
>scheduler uses the allocated VMIDs to support a potentially larger number of
>user processes by dynamically mapping PASIDs to VMIDs and memory queue
>descriptors (MQDs) to HW queues.
>
>Are there any documentation/specifications online describing these mechanisms?

Nothing yet, but we should write some docco for this similar to what was 
written for the gfx blocks. I'll add that to the list, thanks.
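
Until such documentation exists, the PASID-to-VMID half of the mechanism
quoted above can be pictured with a small software model. This is only a
sketch under stated assumptions, not the firmware or kfd code: the VMID
range 8-15 is taken from patch 02/83, PASID 0 is assumed to be unused, and
the names vmid_pool, vmid_bind and vmid_unbind are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model: VMIDs 8..15 belong to the HW scheduler (range
 * taken from patch 02/83). All names here are illustrations only. */
#define KFD_VMID_FIRST 8
#define KFD_VMID_COUNT 8
#define PASID_NONE     0u   /* assumption: PASID 0 is never handed out */

struct vmid_pool {
	uint32_t owner[KFD_VMID_COUNT]; /* PASID bound to each VMID, or PASID_NONE */
};

static void vmid_pool_init(struct vmid_pool *p)
{
	for (int i = 0; i < KFD_VMID_COUNT; i++)
		p->owner[i] = PASID_NONE;
}

/* Return the VMID already bound to pasid, or bind the first free one.
 * Returns -1 when every VMID is in use; the caller then has to unbind
 * an idle process before retrying. */
static int vmid_bind(struct vmid_pool *p, uint32_t pasid)
{
	int free_slot = -1;

	assert(pasid != PASID_NONE);
	for (int i = 0; i < KFD_VMID_COUNT; i++) {
		if (p->owner[i] == pasid)
			return KFD_VMID_FIRST + i;
		if (p->owner[i] == PASID_NONE && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -1;
	p->owner[free_slot] = pasid;
	return KFD_VMID_FIRST + free_slot;
}

/* Release whatever VMID is bound to pasid so another process can use it. */
static void vmid_unbind(struct vmid_pool *p, uint32_t pasid)
{
	for (int i = 0; i < KFD_VMID_COUNT; i++)
		if (p->owner[i] == pasid)
			p->owner[i] = PASID_NONE;
}
```

With eight VMIDs, a ninth process only gets one after some other process
has been unbound, which is how a fixed VMID range can serve a larger number
of user processes.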


[Bug 72785] bfgminer --scrypt on 7xxx+

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=72785

--- Comment #25 from Christoph Haag  ---
So with 3.16, newest mesa git, llvm trunk from a few days ago etc. I'm getting
about 12.0 khash/sec out of my HD 7970M with all the stuff one should set:

GPU_MAX_ALLOC_PERCENT=100 GPU_USE_SYNC_OBJECTS=1 bfgminer -v1 --scrypt -S
opencl:auto --url=stratum+tcp://dogepool.pw:3334 --userpass=XX
--intensity=8 --shaders=1280

from the debug output:

 [2014-07-12 00:47:12] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:16] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:20] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:24] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:28] [thread 0: 47872 hashes, 11.7 khash/sec]
 [2014-07-12 00:47:30] 20s: 11.9 avg: 11.5 u:  0.0 kh/s | A:0 R:0+0(none)
HW:0/none

But it's by far not getting as hot as it should (?) with about 96% gpu usage
according to radeontop (it's 61°C but I think it should be 80-90°C and I also
have it set to "high" and "performance").

And if I read that right, performance is about 1/30 of what it could be (350 KH/S)

So...

Close this bug because it does indeed run or keep it open for performance
issues?

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20140711/07c7710d/attachment.html>


[RFC] dma-buf: Implement test module

2014-07-11 Thread Sam Ravnborg
On Thu, Jul 10, 2014 at 11:55:26AM +0200, Thierry Reding wrote:
> On Wed, Mar 26, 2014 at 09:32:47AM +0100, Thierry Reding wrote:
> > On Tue, Mar 25, 2014 at 07:01:10PM +0100, Sam Ravnborg wrote:
> > > > 
> > > > There are two things that don't work too well with this. First this
> > > > causes the build to break if the build machine doesn't have the new
> > > > public header (include/uapi/linux/dma-buf.h) installed yet. So the only
> > > > way to make this work would be by building the kernel once with SAMPLES
> > > > disabled, install the headers and then build again with SAMPLES enabled.
> > > > Which really isn't very nice.
> > > > 
> > > > One other option that I've tried is to modify the include path so that
> > > > the test program would get the in-tree copy of the public header file,
> > > > but that didn't build properly either because the header files aren't
> > > > properly sanitized and therefore the compiler complains about it
> > > > (include/uapi/linux/types.h).
> > > > 
> > > > One other disadvantage of carrying the sample program in the tree is
> > > > that there's only infrastructure to build programs natively on the build
> > > > machine. That's somewhat unfortunate because if you want to run the test
> > > > program on a different architecture you have to either compile the
> > > > kernel natively on that architecture (which isn't very practical on many
> > > > embedded devices) or cross-compile manually.
> > > > 
> > > > I think a much nicer solution would be to add infrastructure to cross-
> > > > compile these test programs, so that they end up being built for the
> > > > same architecture as the kernel image (i.e. using CROSS_COMPILE).
> > > > 
> > > > Adding Michal and the linux-kbuild mailing list, perhaps this has been
> > > > discussed before, or maybe somebody has a better idea on how to solve
> > > > this.
> > > I actually looked into this some time ago.
> > > May try to dust off the patch.
> > > IIRC the kernel provided headers were used for building - not the one 
> > > installed on the machine.
> > > And cross-compiling was supported.
> > 
> > That sounds exactly like what I'd want for this. If you need any help,
> > please let me know.
> 
> Did you have any time to look into dusting off the patch? If not I'll
> gladly take whatever you have and dust it off myself.
Thanks for the reminder.
I got it almost working after simplifying it a lot.
I will be travelling for the next few days but will continue to work on this
after the weekend.

Sam


[Bug 73053] dpm hangs with BTC parts

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=73053

--- Comment #38 from almos  ---
Attachment 102081 fixes the "hard lockup with small vertical blue stripes"
issue, when applied to 3.15.4, and AFAICS dpm works fine.

The new problem is that I get a kernel panic after a few hours if dpm is enabled.
With the good old profile method the system is stable.



[GIT PULL] Armada DRM devel updates

2014-07-11 Thread Russell King
David,

Please incorporate the latest Armada DRM updates, which can be found at:

  git://ftp.arm.linux.org.uk/~rmk/linux-arm.git drm-armada-devel

with SHA1 9611cb93fa65dde199f4f888bd034ffc80c7adf0, based on v3.16-rc3.

This pull includes the component helpers which have been merged into
Greg's driver-devel tree, and the new DRM OF helper file, which
Rob has reviewed, along with the Armada DRM updates.

The Armada DRM changes:
- move the interrupt handling solely into the CRTC layer, otherwise
  we have problems as each CRTC has its own interrupt signal.
- change the way CRTCs are numbered, to use the number of CRTCs which
  have already been registered.  This ultimately equates to the same
  number, so we achieve the same thing but in a simpler way.
- move the variant initialisation from the DRM driver layer to the CRTC
  layer, which is really where it's needed, and make the variant
  handling purely a CRTC thing.
- augment Armada DRM with the component helper to allow multiple
  struct device's to describe the DRM subsystem.  This will be necessary
  for DT support, as each LCD controller is described as a separate node
  in DT, thus creating separate device structures for each LCD controller.
- tweak the external reference clock name to match the documentation
  more exactly - there is no underscore before the '1'.
- allow CRTCs to be registered as separate devices, thereby allowing
  DT to describe the LCD controllers, while preserving the remainder of
  the original behaviour.  (The original driver behaviour is still
  available at this time.)
- register the CRTCs with the DT node so that the recently introduced
  DRM OF helper can be used by encoders to locate their associated
  CRTCs.

This will update the following files:

 .../bindings/drm/armada/marvell,dove-lcd.txt |  30 +++
 drivers/base/component.c                     | 192 +---
 drivers/gpu/drm/Makefile                     |   1 +
 drivers/gpu/drm/armada/armada_510.c          |  23 +-
 drivers/gpu/drm/armada/armada_crtc.c         | 187 ++--
 drivers/gpu/drm/armada/armada_crtc.h         |  11 +-
 drivers/gpu/drm/armada/armada_drm.h          |  13 +-
 drivers/gpu/drm/armada/armada_drv.c          | 245 +++--
 drivers/gpu/drm/drm_of.c                     |  67 ++
 include/drm/drm_crtc.h                       |   2 +
 include/drm/drm_of.h                         |  18 ++
 include/linux/component.h                    |   7 +
 12 files changed, 642 insertions(+), 154 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/drm/armada/marvell,dove-lcd.txt
 create mode 100644 drivers/gpu/drm/drm_of.c
 create mode 100644 include/drm/drm_of.h

through these changes:

Russell King (15):
  component: fix missed cleanup in case of devres failure
  component: ignore multiple additions of the same component
  component: add support for component match array
  drm/armada: move IRQ handling into CRTC
  drm/armada: use number of CRTCs registered
  drm/armada: move variant initialisation to CRTC init
  drm/armada: make variant a CRTC thing
  component: fix bug with legacy API
  drm: add of_graph endpoint helper to find possible CRTCs
  Merge branches 'drm-devel' and 'component-for-driver' into armada-drm
  drm/armada: convert to componentized support
  drm/armada: update Armada 510 (Dove) to use "ext_ref_clk1" as the clock
  dt-bindings: add Marvell Dove LCD controller documentation
  drm/armada: permit CRTCs to be registered as separate devices
  drm/armada: register crtc with port

Many thanks.


[Xen-devel] [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type

2014-07-11 Thread Tian, Kevin
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk at oracle.com]
> Sent: Friday, July 11, 2014 12:42 PM
> 
> On Fri, Jul 11, 2014 at 08:29:56AM +0200, Daniel Vetter wrote:
> > On Thu, Jul 10, 2014 at 09:08:24PM +, Tian, Kevin wrote:
> > > actually I'm curious whether it's still necessary to __detect__ PCH. Could
> > > we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard
> > > code the knowledge:
> > >
> > > >   } else if (IS_BROADWELL(dev)) {
> > > >           dev_priv->pch_type = PCH_LPT;
> > > >           dev_priv->pch_id =
> > > >                   INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
> > > >           DRM_DEBUG_KMS("This is Broadwell, assuming "
> > > >                         "LynxPoint LP PCH\n");
> > > >
> > > > Or if there is real usage on non-fixed mapping (not majority), could it
> > > > be a better option to have fixed mapping as a fallback instead of
> > > > leaving as PCH_NONE? Then even when Qemu doesn't provide a special
> > > > tweaked PCH, the majority case just works.
> >
> > I guess we can do it, at least I haven't seen any strange combinations in
> > the wild outside of Intel ...
> 
> How big is the QA matrix for this? Would it make sense to just
> include the latest hardware (say going two generations back)
> and ignore the older one?

I suppose there is minimal or no QA effort on bare metal if we only
conservatively change the fallback path, which today is not supposed to
function with PCH_NONE. So it's only the same amount of QA effort as whatever
else is proposed in this passthru upstreaming task. I agree there is no need
to cover older models, possibly just snb, ivb and hsw, but I will leave Tiejun
to answer the overall goal.
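
The fallback idea discussed above can be sketched as follows. This is a
minimal illustration, not the i915 code: the enums are hypothetical
stand-ins, and the generation-to-PCH pairing (snb/ivb with a
CougarPoint-class PCH, hsw/bdw with a LynxPoint-class PCH) is an assumption
about the common bare-metal combinations.

```c
#include <assert.h>

/* Hypothetical stand-ins for the i915 definitions; only the idea matters. */
enum gpu_gen  { GEN_SNB, GEN_IVB, GEN_HSW, GEN_BDW, GEN_OTHER };
enum pch_type { PCH_NONE, PCH_CPT, PCH_LPT };

/* Assumed usual pairing of GPU generation and PCH on bare metal. */
static enum pch_type pch_fallback(enum gpu_gen gen)
{
	switch (gen) {
	case GEN_SNB:
	case GEN_IVB:
		return PCH_CPT;  /* CougarPoint/PantherPoint class */
	case GEN_HSW:
	case GEN_BDW:
		return PCH_LPT;  /* LynxPoint class */
	default:
		return PCH_NONE; /* no safe guess for other generations */
	}
}

/* Prefer what bus probing found; guess only when probing found nothing,
 * so the bare-metal path is unchanged whenever detection succeeds. */
static enum pch_type pch_resolve(enum pch_type detected, enum gpu_gen gen)
{
	return detected != PCH_NONE ? detected : pch_fallback(gen);
}
```

Because the guess is consulted only when detection returns PCH_NONE, a
guest without an emulated PCH gets a usable default while existing
configurations keep their detected value.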

Thanks
Kevin


[Bug 69723] GPU lockups with kernel 3.11.0 / 3.12-rc1 when dpm=1 on r600g (Cayman)

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=69723

--- Comment #124 from Alexandre Demers  ---
(In reply to comment #123)
> I just got 1 lockup. The 1st in 10 days.

Do you have exactly the same symptoms as before? On my side, everything is
still fine. The only problem I've encountered is related to X.

I've been running and testing everything with kernel 3.16 which has some other
fixes related to Cayman, so maybe you encountered a different issue than the
one from the current bug.



[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Daniel Vetter
On Fri, Jul 11, 2014 at 7:04 PM, Jerome Glisse  wrote:
> Are we to assume that for eternity this will not work on iommu that do support
> PASID/ATS but are not from AMD ? If it was an APU specific function i would
> understand but it seems that the IOMMU API needs to grow. I am pretty sure
> Intel will have an ATS/PASID IOMMU.

Also this isn't just for gpus - I hear noises that it e.g. could also
be used to virtualize a single ethernet NIC to different guest OS
directly. Adding ats/pasid support to the linux iommu interfaces
sounds like the right approach to me.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-11 Thread Ilyes Gouta
Hi,

Just a side question (for information),

On Fri, Jul 11, 2014 at 6:07 PM, Bridgman, John 
wrote:

>
> Right. The SET_RESOURCES packet (kfd_pm4_headers.h, added in patch 49)
> allocates a range of HW queues, VMIDs and GDS to the HW scheduler, then the
> scheduler uses the allocated VMIDs to support a potentially larger number
> of user processes by dynamically mapping PASIDs to VMIDs and memory queue
> descriptors (MQDs) to HW queues.
>

Are there any documentation/specifications online describing these
mechanisms?

Thanks,


[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Bridgman, John


>-----Original Message-----
>From: Jerome Glisse [mailto:j.glisse at gmail.com]
>Sent: Friday, July 11, 2014 2:52 PM
>To: Bridgman, John
>Cc: Oded Gabbay; David Airlie; Deucher, Alexander; linux-
>kernel at vger.kernel.org; dri-devel at lists.freedesktop.org; Lewycky, Andrew;
>Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki; Kishon
>Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas Pandruvada;
>Santosh Shilimkar; Andreas Noever; Lucas Stach; Philipp Zabel
>Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver for
>AMD's GPUs
>
>On Fri, Jul 11, 2014 at 06:46:30PM +, Bridgman, John wrote:
>> >From: Jerome Glisse [mailto:j.glisse at gmail.com]
>> >Sent: Friday, July 11, 2014 2:11 PM
>> >To: Bridgman, John
>> >Cc: Oded Gabbay; David Airlie; Deucher, Alexander; linux-
>> >kernel at vger.kernel.org; dri-devel at lists.freedesktop.org; Lewycky,
>> >Andrew; Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J.
>> >Wysocki; Kishon Vijay Abraham I; Sandeep Nair; Kenneth Heitke;
>> >Srinivas Pandruvada; Santosh Shilimkar; Andreas Noever; Lucas Stach;
>> >Philipp Zabel
>> >Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver
>> >for AMD's GPUs
>> >
>> >On Fri, Jul 11, 2014 at 06:02:39PM +, Bridgman, John wrote:
>> >> >From: Jerome Glisse [mailto:j.glisse at gmail.com]
>> >> >Sent: Friday, July 11, 2014 1:04 PM
>> >> >To: Oded Gabbay
>> >> >Cc: David Airlie; Deucher, Alexander;
>> >> >linux-kernel at vger.kernel.org;
>> >> >dri- devel at lists.freedesktop.org; Bridgman, John; Lewycky, Andrew;
>> >> >Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki;
>> >> >Kishon Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas
>> >> >Pandruvada; Santosh Shilimkar; Andreas Noever; Lucas Stach;
>> >> >Philipp Zabel
>> >> >Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver
>> >> >for AMD's GPUs
>> >> >
>> >> >On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
>> >> >> This patch adds the code base of the hsa driver for AMD's GPUs.
>> >> >>
>> >> >> This driver is called kfd.
>> >> >>
>> >> >> This initial version supports the first HSA chip, Kaveri.
>> >> >>
>> >> >> This driver is located in a new directory structure under drivers/gpu.
>> >> >>
>> >> >> Signed-off-by: Oded Gabbay 
>> >> >
>> >> >There are too many coding style issues. While we have been lax on
>> >> >enforcing the scripts/checkpatch.pl rules, I think there is a limit
>> >> >to that. I am not strict on the 80 chars per line, but other things
>> >> >need fixing so we stay in line.
>> >> >
>> >> >Also i am a bit worried about the license, given top comment in
>> >> >each of the files i am not sure this is GPL2 compatible. I would
>> >> >need to ask lawyer to review that.
>> >> >
>> >>
>> >> Hi Jerome,
>> >>
>> >> Which line in the license are you concerned about ? In theory we're
>> >> using the same license as the initial code pushes for radeon, and I just
>> >> did a side-by side compare with the license header on cik.c in the
>> >> radeon tree and confirmed that the two licenses are identical.
>> >>
>> >> The cik.c header has an additional "Authors:" line which the kfd
>> >> files do not, but AFAIK that is not part of the license text proper.
>> >>
>> >
>> >You can not claim GPL if you want to use this license. radeon is a
>> >weird beast for historical reasons, as we wanted to share code with BSD,
>> >thus it is dual licensed and this is reflected with:
>> >MODULE_LICENSE("GPL and additional rights");
>> >
>> >inside radeon_drv.c
>> >
>> >So if you want to have MODULE_LICENSE(GPL) then you should have a
>> >header that uses the GPL license wording and no wording from a BSD-like
>> >license.
>> >Otherwise change the MODULE_LICENSE, and it would also be good to say
>> >dual licensed at the top of each file (or at least next to each license)
>> >so that it is clear this is BSD & GPL licensed.
>>
>> Got it. Missed that we had a different MODULE_LICENSE.
>>
>> Since the goal is license compatibility with radeon so we can update the
>> interface and move code between the drivers in future I guess my
>> preference would be to update MODULE_LICENSE in the kfd code to "GPL and
>> additional rights", do you think that would be OK ?
>
>I am not a lawyer and nothing that I said should be considered legal advice
>(on the contrary ;)). I think you need to be clearer with each license and
>clearly say GPLv2 or BSD, i.e. dual licensed, but dual licensing is a beast you
>would definitely want to talk to a lawyer about.

Yeah, dual license seems horrid in its implications for developers so we've 
always tried to avoid it. GPL hurts us for porting to other OSes so the X11 / 
"GPL with additional rights" combo seemed like the ideal solution and we made 
it somewhat of a corporate standard. Hope that doesn't come back to haunt us. 

Meditate on this I will. Thanks !

>
>Cheers,
>Jérôme


[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Bridgman, John


>-----Original Message-----
>From: Jerome Glisse [mailto:j.glisse at gmail.com]
>Sent: Friday, July 11, 2014 2:11 PM
>To: Bridgman, John
>Cc: Oded Gabbay; David Airlie; Deucher, Alexander; linux-
>kernel at vger.kernel.org; dri-devel at lists.freedesktop.org; Lewycky, Andrew;
>Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki; Kishon
>Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas Pandruvada;
>Santosh Shilimkar; Andreas Noever; Lucas Stach; Philipp Zabel
>Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver for
>AMD's GPUs
>
>On Fri, Jul 11, 2014 at 06:02:39PM +, Bridgman, John wrote:
>> >From: Jerome Glisse [mailto:j.glisse at gmail.com]
>> >Sent: Friday, July 11, 2014 1:04 PM
>> >To: Oded Gabbay
>> >Cc: David Airlie; Deucher, Alexander; linux-kernel at vger.kernel.org;
>> >dri- devel at lists.freedesktop.org; Bridgman, John; Lewycky, Andrew;
>> >Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki;
>> >Kishon Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas
>> >Pandruvada; Santosh Shilimkar; Andreas Noever; Lucas Stach; Philipp
>> >Zabel
>> >Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver
>> >for AMD's GPUs
>> >
>> >On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
>> >> This patch adds the code base of the hsa driver for AMD's GPUs.
>> >>
>> >> This driver is called kfd.
>> >>
>> >> This initial version supports the first HSA chip, Kaveri.
>> >>
>> >> This driver is located in a new directory structure under drivers/gpu.
>> >>
>> >> Signed-off-by: Oded Gabbay 
>> >
>> >There are too many coding style issues. While we have been lax on
>> >enforcing the scripts/checkpatch.pl rules, I think there is a limit to
>> >that. I am not strict on the 80 chars per line, but other things need
>> >fixing so we stay in line.
>> >
>> >Also i am a bit worried about the license, given top comment in each
>> >of the files i am not sure this is GPL2 compatible. I would need to
>> >ask lawyer to review that.
>> >
>>
>> Hi Jerome,
>>
>> Which line in the license are you concerned about ? In theory we're using
>> the same license as the initial code pushes for radeon, and I just did a
>> side-by side compare with the license header on cik.c in the radeon tree and
>> confirmed that the two licenses are identical.
>>
>> The cik.c header has an additional "Authors:" line which the kfd files do
>> not, but AFAIK that is not part of the license text proper.
>>
>
>You can not claim GPL if you want to use this license. radeon is a weird beast
>for historical reasons, as we wanted to share code with BSD, thus it is dual
>licensed and this is reflected with:
>MODULE_LICENSE("GPL and additional rights");
>
>inside radeon_drv.c
>
>So if you want to have MODULE_LICENSE(GPL) then you should have a header
>that uses the GPL license wording and no wording from a BSD-like license.
>Otherwise change the MODULE_LICENSE, and it would also be good to say
>dual licensed at the top of each file (or at least next to each license) so
>that it is clear this is BSD & GPL licensed.

Got it. Missed that we had a different MODULE_LICENSE.

Since the goal is license compatibility with radeon so we can update the 
interface and move code between the drivers in future I guess my preference 
would be to update MODULE_LICENSE in the kfd code to "GPL and additional 
rights", do you think that would be OK ?
>
>Cheers,
>Jérôme


[Nouveau] [PATCH v4 4/6] drm/nouveau: synchronize BOs when required

2014-07-11 Thread Alexandre Courbot
On 07/11/2014 04:41 PM, Daniel Vetter wrote:
> On Fri, Jul 11, 2014 at 11:40:27AM +0900, Alexandre Courbot wrote:
>> On 07/10/2014 10:04 PM, Daniel Vetter wrote:
>>> On Tue, Jul 08, 2014 at 05:25:59PM +0900, Alexandre Courbot wrote:
>>>> On architectures for which access to GPU memory is non-coherent,
>>>> caches need to be flushed and invalidated explicitly when BO control
>>>> changes between CPU and GPU.
>>>>
>>>> This patch adds buffer synchronization functions which invoke the
>>>> correct API (PCI or DMA) to ensure synchronization is effective.
>>>>
>>>> Based on the TTM DMA cache helper patches by Lucas Stach.
>>>>
>>>> Signed-off-by: Lucas Stach
>>>> Signed-off-by: Alexandre Courbot
>>>> ---
>>>>   drivers/gpu/drm/nouveau/nouveau_bo.c  | 56 +++
>>>>   drivers/gpu/drm/nouveau/nouveau_bo.h  |  2 ++
>>>>   drivers/gpu/drm/nouveau/nouveau_gem.c | 12
>>>>   3 files changed, 70 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
>>>> index 67e9e8e2e2ec..47e4e8886769 100644
>>>> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
>>>> @@ -402,6 +402,60 @@ nouveau_bo_unmap(struct nouveau_bo *nvbo)
>>>>                  ttm_bo_kunmap(&nvbo->kmap);
>>>>   }
>>>>
>>>> +void
>>>> +nouveau_bo_sync_for_device(struct nouveau_bo *nvbo)
>>>> +{
>>>> +        struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
>>>> +        struct nouveau_device *device = nouveau_dev(drm->dev);
>>>> +        struct ttm_dma_tt *ttm_dma = (struct ttm_dma_tt *)nvbo->bo.ttm;
>>>> +        int i;
>>>> +
>>>> +        if (!ttm_dma)
>>>> +                return;
>>>> +
>>>> +        if (nv_device_is_cpu_coherent(device) || nvbo->force_coherent)
>>>> +                return;
>>>
>>> Is the is_cpu_coherent check really required? On coherent platforms the
>>> sync_for_foo should be a noop. It's the dma api's job to encapsulate this
>>> knowledge so that drivers can be blissfully ignorant. The explicit
>>> is_coherent check makes this a bit leaky. And same comment that underlying
>>> the bus-specifics dma-mapping functions are identical.
>>
>> I think you are right, the is_cpu_coherent check should not be needed here.
>> I still think we should have separate paths for the PCI/DMA cases though,
>> unless you can point me to a source that clearly states that the PCI API is
>> deprecated and that DMA should be used instead.
>
> Ah, on 2nd look I've found it again. Quoting
> Documentation/DMA-API-HOWTO.txt:
>
> "Note that the DMA API works with any bus independent of the underlying
> microprocessor architecture. You should use the DMA API rather than the
> bus-specific DMA API, i.e., use the dma_map_*() interfaces rather than the
> pci_map_*() interfaces."
>
> The advice is fairly strong here I think ;-) And imo the idea makes sense,
> since it allows drivers like nouveau here to care much less about the
> actual bus used to get data to/from the ip block. And if you look at intel
> gfx it makes even more sense since the pci layer we have is really just a
> thin fake shim whacked on top of the hw (on SoCs at least).

Indeed, I stand corrected. :) That's good news actually, as it will 
simplify the code. Thanks for pointing that out!

I will send a new revision that makes use of the DMA API exclusively and 
will remove the nv_device_map/unmap() functions which are pretty useless 
now.


[Bug 69723] GPU lockups with kernel 3.11.0 / 3.12-rc1 when dpm=1 on r600g (Cayman)

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=69723

--- Comment #123 from Marc  ---
I just got 1 lockup. The 1st in 10 days.



[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-11 Thread Christian König
Am 11.07.2014 18:05, schrieb Jerome Glisse:
> On Fri, Jul 11, 2014 at 12:50:02AM +0300, Oded Gabbay wrote:
>> To support HSA on KV, we need to limit the number of vmids and pipes
>> that are available for radeon's use with KV.
>>
>> This patch reserves VMIDs 8-15 for KFD (so radeon can only use VMIDs
>> 0-7) and also makes radeon think that KV has only a single MEC with a single
>> pipe in it
>>
>> Signed-off-by: Oded Gabbay 
> Reviewed-by: Jérôme Glisse

At least for the VMIDs, on-demand allocation should be trivial to
implement, so I would prefer that over a fixed assignment.

Christian.

>
>> ---
>>   drivers/gpu/drm/radeon/cik.c | 48 ++--
>>   1 file changed, 24 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
>> index 4bfc2c0..e0c8052 100644
>> --- a/drivers/gpu/drm/radeon/cik.c
>> +++ b/drivers/gpu/drm/radeon/cik.c
>> @@ -4662,12 +4662,11 @@ static int cik_mec_init(struct radeon_device *rdev)
>>          /*
>>           * KV:    2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
>>           * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total
>> +         * Nonetheless, we assign only 1 pipe because all other pipes will
>> +         * be handled by KFD
>>           */
>> -        if (rdev->family == CHIP_KAVERI)
>> -                rdev->mec.num_mec = 2;
>> -        else
>> -                rdev->mec.num_mec = 1;
>> -        rdev->mec.num_pipe = 4;
>> +        rdev->mec.num_mec = 1;
>> +        rdev->mec.num_pipe = 1;
>>          rdev->mec.num_queue = rdev->mec.num_mec * rdev->mec.num_pipe * 8;
>>
>>          if (rdev->mec.hpd_eop_obj == NULL) {
>> @@ -4809,28 +4808,24 @@ static int cik_cp_compute_resume(struct radeon_device *rdev)
>>
>>          /* init the pipes */
>>          mutex_lock(&rdev->srbm_mutex);
>> -        for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
>> -                int me = (i < 4) ? 1 : 2;
>> -                int pipe = (i < 4) ? i : (i - 4);
>>
>> -                eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i * MEC_HPD_SIZE * 2);
>> +        eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;
>>
>> -                cik_srbm_select(rdev, me, pipe, 0, 0);
>> +        cik_srbm_select(rdev, 0, 0, 0, 0);
>>
>> -                /* write the EOP addr */
>> -                WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
>> -                WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);
>> +        /* write the EOP addr */
>> +        WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
>> +        WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);
>>
>> -                /* set the VMID assigned */
>> -                WREG32(CP_HPD_EOP_VMID, 0);
>> +        /* set the VMID assigned */
>> +        WREG32(CP_HPD_EOP_VMID, 0);
>> +
>> +        /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
>> +        tmp = RREG32(CP_HPD_EOP_CONTROL);
>> +        tmp &= ~EOP_SIZE_MASK;
>> +        tmp |= order_base_2(MEC_HPD_SIZE / 8);
>> +        WREG32(CP_HPD_EOP_CONTROL, tmp);
>>
>> -                /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
>> -                tmp = RREG32(CP_HPD_EOP_CONTROL);
>> -                tmp &= ~EOP_SIZE_MASK;
>> -                tmp |= order_base_2(MEC_HPD_SIZE / 8);
>> -                WREG32(CP_HPD_EOP_CONTROL, tmp);
>> -        }
>> -        cik_srbm_select(rdev, 0, 0, 0, 0);
>>          mutex_unlock(&rdev->srbm_mutex);
>>
>>          /* init the queues.  Just two for now. */
>> @@ -5876,8 +5871,13 @@ int cik_ib_parse(struct radeon_device *rdev, struct radeon_ib *ib)
>>    */
>>   int cik_vm_init(struct radeon_device *rdev)
>>   {
>> -        /* number of VMs */
>> -        rdev->vm_manager.nvm = 16;
>> +        /*
>> +         * number of VMs
>> +         * VMID 0 is reserved for Graphics
>> +         * radeon compute will use VMIDs 1-7
>> +         * KFD will use VMIDs 8-15
>> +         */
>> +        rdev->vm_manager.nvm = 8;
>>          /* base offset of vram pages */
>>          if (rdev->flags & RADEON_IS_IGP) {
>>                  u64 tmp = RREG32(MC_VM_FB_OFFSET);
>> -- 
>> 1.9.1
>>
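
The arithmetic in the quoted patch can be checked with a short worked
example. It is a sketch only: MEC_HPD_SIZE is assumed to be 2048 bytes (the
patch does not show its definition), order_base_2_sketch is a userspace
stand-in for the kernel's order_base_2(), and the helper names are
hypothetical.

```c
#include <assert.h>

/* Assumed value: the patch uses MEC_HPD_SIZE but does not show its definition. */
#define MEC_HPD_SIZE 2048u

/* Queues visible to radeon: 8 queues per pipe, as in the patch comment. */
static unsigned int mec_num_queue(unsigned int num_mec, unsigned int num_pipe)
{
	return num_mec * num_pipe * 8;
}

/* Userspace stand-in for the kernel's order_base_2(): ceil(log2(n)), n >= 1. */
static unsigned int order_base_2_sketch(unsigned int n)
{
	unsigned int order = 0;

	assert(n >= 1);
	while ((1u << order) < n)
		order++;
	return order;
}

/* The EOP_SIZE field encodes a buffer of 2^(EOP_SIZE+1) dwords (4 bytes each). */
static unsigned int eop_size_bytes(unsigned int field)
{
	return (1u << (field + 1)) * 4;
}
```

Under these assumptions, KV drops from 2 * 4 * 8 = 64 queues to
1 * 1 * 8 = 8 queues for radeon, and order_base_2(2048 / 8) = 8 encodes
2^(8+1) = 512 dwords, which is the 2048 bytes of MEC_HPD_SIZE again.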



[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Bridgman, John


>-----Original Message-----
>From: Jerome Glisse [mailto:j.glisse at gmail.com]
>Sent: Friday, July 11, 2014 1:04 PM
>To: Oded Gabbay
>Cc: David Airlie; Deucher, Alexander; linux-kernel at vger.kernel.org; dri-
>devel at lists.freedesktop.org; Bridgman, John; Lewycky, Andrew; Joerg
>Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki; Kishon Vijay
>Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas Pandruvada; Santosh
>Shilimkar; Andreas Noever; Lucas Stach; Philipp Zabel
>Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver for
>AMD's GPUs
>
>On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
>> This patch adds the code base of the hsa driver for
>> AMD's GPUs.
>>
>> This driver is called kfd.
>>
>> This initial version supports the first HSA chip, Kaveri.
>>
>> This driver is located in a new directory structure under drivers/gpu.
>>
>> Signed-off-by: Oded Gabbay 
>
>There are too many coding style issues. While we have been lax on enforcing the
>scripts/checkpatch.pl rules, I think there is a limit to that. I am not strict
>on the 80 chars per line, but other things need fixing so we stay in line.
>
>Also i am a bit worried about the license, given top comment in each of the
>files i am not sure this is GPL2 compatible. I would need to ask lawyer to
>review that.
>

Hi Jerome,

Which line in the license are you concerned about ? In theory we're using the 
same license as the initial code pushes for radeon, and I just did a side-by 
side compare with the license header on cik.c in the radeon tree and confirmed 
that the two licenses are identical. 

The cik.c header has an additional "Authors:" line which the kfd files do not, 
but AFAIK that is not part of the license text proper.

JB


[RFC PATCH] drm: rework flip-work helpers to avoid calling func when the FIFO is full

2014-07-11 Thread Boris BREZILLON
On Fri, 11 Jul 2014 17:47:05 +0200
Boris BREZILLON  wrote:

> On Fri, 11 Jul 2014 11:41:12 -0400
> Rob Clark  wrote:
> 
> > On Fri, Jul 11, 2014 at 11:17 AM, Boris BREZILLON
> >  wrote:
> > > > Make use of lists instead of kfifo in order to dynamically allocate
> > > > a task entry when someone requires some delayed work, thus preventing
> > > > drm_flip_work_queue from directly calling func instead of queuing this
> > > > call.
> > > > This allows drm_flip_work_queue to be safely called even within irq
> > > > handlers.
> > > >
> > > > Add new helper functions to allocate a flip work task and queue it when
> > > > needed. This prevents allocating data within irq context (which might
> > > > impact the time spent in the irq handler).
> > >
> > > Signed-off-by: Boris BREZILLON 
> > > ---
> > > Hi Rob,
> > >
> > > This is a proposal for what you suggested (dynamically growing the drm
> > > flip work queue in order to avoid direct call of work->func when calling
> > > drm_flip_work_queue).
> > >
> > > I'm not sure this is exactly what you expected, because I'm now using
> > > lists instead of kfifo (and thus lose the lockless part), but at least
> > > we can now safely call drm_flip_work_queue or drm_flip_work_queue_task
> > > from irq handlers :-).
> > >
> > > You were also worried about queueing the same framebuffer multiple times
> > > and with this implementation you shouldn't have any problem (at least with
> > > drm_flip_work_queue; what people do with drm_flip_work_queue_task is their
> > > own responsibility, but they should allocate one task for each operation
> > > even if they are manipulating the same framebuffer).
> > 
> > yeah, if we are dynamically allocating the list nodes, that solves the
> > queuing-up-multiple-times issue..
> > 
> > I wonder if drm_flip_work_allocate_task() should use GFP_ATOMIC when
> > allocating?
> 
> That's funny, I was actually modifying the API to pass gfp_t flags to
> this function ;-)
> 
> > I guess maybe it is possible to pre-allocate the task
> > from non-irq context, and then queue it from irq context.. it makes
> > the API a bit more complex, but there are only a couple users
> > currently, so I suppose this should be doable.
> 
> I tried to keep the existing API so that existing users won't see the
> difference (I guess none of them are calling drm_flip_work_queue).

Some words are missing :-):

(I guess none of them are calling drm_flip_work_queue from irq
handlers).

> 
> I just added the drm_flip_work_allocate_task and
> drm_flip_work_queue_task for those who want more control on the
> queuing process.
> 
> Best Regards,
> 
> Boris
> 
> 
> 
> 
> 



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


[PATCH 07/83] drm/radeon: Add kfd-->kgd interface of locking srbm_gfx_cntl register

2014-07-11 Thread Bridgman, John
Checking... we shouldn't need to call the lock from kfd any more. We should be
able to do any required locking in radeon kgd code.
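
To illustrate the direction suggested here, the following is a minimal userspace sketch, not the radeon code: instead of exporting lock/unlock callbacks, the owning driver exposes one operation that takes its internal lock, selects the register bank, writes, and restores the default selection. Every name in it (gfx_dev, gfx_write_banked, the register layout, the atomic_flag spin lock standing in for srbm_mutex) is a hypothetical assumption made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the graphics driver's device structure. */
struct gfx_dev {
	atomic_flag srbm_lock;   /* plays the role of srbm_mutex; never exported */
	uint32_t srbm_gfx_cntl;  /* models the bank-select register */
	uint32_t regs[4][8];     /* models banked per-(pipe, queue) registers */
};

void gfx_dev_init(struct gfx_dev *dev)
{
	assert(dev != NULL);
	atomic_flag_clear(&dev->srbm_lock);
	dev->srbm_gfx_cntl = 0;
	for (int p = 0; p < 4; p++)
		for (int q = 0; q < 8; q++)
			dev->regs[p][q] = 0;
}

/*
 * The only entry point offered to the other driver: select a bank, write one
 * value, and restore the default selection, all under the internal lock.
 * Returns 0 on success, -1 on an out-of-range pipe or queue.
 */
int gfx_write_banked(struct gfx_dev *dev, unsigned pipe, unsigned queue,
		     uint32_t value)
{
	assert(dev != NULL);
	if (pipe >= 4 || queue >= 8)
		return -1;

	while (atomic_flag_test_and_set(&dev->srbm_lock))
		; /* spin; a kernel driver would sleep on a mutex instead */

	dev->srbm_gfx_cntl = (pipe << 8) | queue; /* select the bank */
	dev->regs[pipe][queue] = value;           /* banked write */
	dev->srbm_gfx_cntl = 0;                   /* back to the default bank */

	atomic_flag_clear(&dev->srbm_lock);
	return 0;
}
```

With this shape the caller cannot forget to unlock or leave a non-default bank selected, which appears to be the concern raised in the review below.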

>-----Original Message-----
>From: Jerome Glisse [mailto:j.glisse at gmail.com]
>Sent: Friday, July 11, 2014 12:35 PM
>To: Oded Gabbay
>Cc: David Airlie; Deucher, Alexander; linux-kernel at vger.kernel.org; dri-
>devel at lists.freedesktop.org; Bridgman, John; Lewycky, Andrew; Joerg
>Roedel; Gabbay, Oded; Koenig, Christian
>Subject: Re: [PATCH 07/83] drm/radeon: Add kfd-->kgd interface of locking
>srbm_gfx_cntl register
>
>On Fri, Jul 11, 2014 at 12:50:07AM +0300, Oded Gabbay wrote:
>> This patch adds a new interface to kfd2kgd_calls structure, which
>> allows the kfd to lock and unlock the srbm_gfx_cntl register
>
>Why does kfd need to lock this register if kfd cannot access any of those
>registers? This sounds broken to me; exposing a driver-internal mutex to
>another driver is not something I am a fan of.
>
>Cheers,
>Jérôme
>
>>
>> Signed-off-by: Oded Gabbay 
>> ---
>>  drivers/gpu/drm/radeon/radeon_kfd.c | 20 
>>  include/linux/radeon_kfd.h  |  4 
>>  2 files changed, 24 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c
>> b/drivers/gpu/drm/radeon/radeon_kfd.c
>> index 66ee36b..594020e 100644
>> --- a/drivers/gpu/drm/radeon/radeon_kfd.c
>> +++ b/drivers/gpu/drm/radeon/radeon_kfd.c
>> @@ -43,6 +43,10 @@ static void unkmap_mem(struct kgd_dev *kgd, struct
>> kgd_mem *mem);
>>
>>  static uint64_t get_vmem_size(struct kgd_dev *kgd);
>>
>> +static void lock_srbm_gfx_cntl(struct kgd_dev *kgd);
>> +static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
>> +
>> +
>>  static const struct kfd2kgd_calls kfd2kgd = {
>>  .allocate_mem = allocate_mem,
>>  .free_mem = free_mem,
>> @@ -51,6 +55,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>  .kmap_mem = kmap_mem,
>>  .unkmap_mem = unkmap_mem,
>>  .get_vmem_size = get_vmem_size,
>> +.lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
>> +.unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
>>  };
>>
>>  static const struct kgd2kfd_calls *kgd2kfd; @@ -233,3 +239,17 @@
>> static uint64_t get_vmem_size(struct kgd_dev *kgd)
>>
>>  return rdev->mc.real_vram_size;
>>  }
>> +
>> +static void lock_srbm_gfx_cntl(struct kgd_dev *kgd) {
>> +struct radeon_device *rdev = (struct radeon_device *)kgd;
>> +
>> +mutex_lock(&rdev->srbm_mutex);
>> +}
>> +
>> +static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd) {
>> +struct radeon_device *rdev = (struct radeon_device *)kgd;
>> +
>> +mutex_unlock(&rdev->srbm_mutex);
>> +}
>> diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
>> index c7997d4..40b691c 100644
>> --- a/include/linux/radeon_kfd.h
>> +++ b/include/linux/radeon_kfd.h
>> @@ -81,6 +81,10 @@ struct kfd2kgd_calls {
>>  void (*unkmap_mem)(struct kgd_dev *kgd, struct kgd_mem *mem);
>>
>>  uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
>> +
>> +/* SRBM_GFX_CNTL mutex */
>> +void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
>> +void (*unlock_srbm_gfx_cntl)(struct kgd_dev *kgd);
>>  };
>>
>>  bool kgd2kfd_init(unsigned interface_version,
>> --
>> 1.9.1
>>


[RFC PATCH] drm: rework flip-work helpers to avoid calling func when the FIFO is full

2014-07-11 Thread Boris BREZILLON
On Fri, 11 Jul 2014 11:41:12 -0400
Rob Clark  wrote:

> On Fri, Jul 11, 2014 at 11:17 AM, Boris BREZILLON
>  wrote:
> > Make use of lists instead of kfifo in order to dynamically allocate a
> > task entry when someone requires some delayed work, thus preventing
> > drm_flip_work_queue from directly calling func instead of queuing this
> > call.
> > This allows drm_flip_work_queue to be safely called even within irq
> > handlers.
> >
> > Add new helper functions to allocate a flip work task and queue it when
> > needed. This prevents allocating data within irq context (which might
> > impact the time spent in the irq handler).
> >
> > Signed-off-by: Boris BREZILLON 
> > ---
> > Hi Rob,
> >
> > This is a proposal for what you suggested (dynamically growing the drm
> > flip work queue in order to avoid direct call of work->func when calling
> > drm_flip_work_queue).
> >
> > I'm not sure this is exactly what you expected, because I'm now using
> > lists instead of kfifo (and thus lose the lockless part), but at least
> > we can now safely call drm_flip_work_queue or drm_flip_work_queue_task
> > from irq handlers :-).
> >
> > You were also worried about queueing the same framebuffer multiple times
> > and with this implementation you shouldn't have any problem (at least with
> > drm_flip_work_queue; what people do with drm_flip_work_queue_task is their
> > own responsibility, but they should allocate one task for each operation
> > even if they are manipulating the same framebuffer).
> 
> yeah, if we are dynamically allocating the list nodes, that solves the
> queuing-up-multiple-times issue..
> 
> I wonder if drm_flip_work_allocate_task() should use GFP_ATOMIC when
> allocating?

That's funny, I was actually modifying the API to pass gfp_t flags to
this function ;-)

> I guess maybe it is possible to pre-allocate the task
> from non-irq context, and then queue it from irq context.. it makes
> the API a bit more complex, but there are only a couple users
> currently, so I suppose this should be doable.

I tried to keep the existing API so that existing users won't see the
difference (I guess none of them are calling drm_flip_work_queue).

I just added the drm_flip_work_allocate_task and
drm_flip_work_queue_task for those who want more control on the
queuing process.
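
As a rough illustration of the split discussed above (allocate from process context, queue from irq context, commit, then drain in a worker), here is a userspace model. It is only a sketch under stated assumptions: it uses a hand-rolled singly linked list instead of the kernel's list_head, omits the spin_lock_irqsave() that the patch takes around every list manipulation, and all names (flip_task, task_list, add_to_total, and so on) are hypothetical rather than the patch's API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; names echo the patch but this is not kernel code. */
struct flip_task {
	struct flip_task *next;
	void *data;
};

struct task_list {
	struct flip_task *head, *tail;
};

struct flip_work {
	struct task_list queued;    /* filled by flip_work_queue_task() */
	struct task_list committed; /* handed to the worker by flip_work_commit() */
	void (*func)(struct flip_work *work, void *data);
};

/* Move every node of src to the tail of dst and leave src empty. */
void list_splice_to_tail(struct task_list *src, struct task_list *dst)
{
	if (!src->head)
		return;
	if (dst->tail)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
	src->head = src->tail = NULL;
}

/* Step 1, process context: the only place that allocates. */
struct flip_task *flip_work_allocate_task(void *data)
{
	struct flip_task *task = calloc(1, sizeof(*task));

	if (task)
		task->data = data;
	return task;
}

/* Step 2, may run in "irq" context: links the node, never allocates. */
void flip_work_queue_task(struct flip_work *work, struct flip_task *task)
{
	struct task_list one = { task, task };

	assert(task && !task->next);
	list_splice_to_tail(&one, &work->queued);
}

/* Step 3: publish everything queued so far to the worker. */
void flip_work_commit(struct flip_work *work)
{
	list_splice_to_tail(&work->queued, &work->committed);
}

/* Step 4, worker: detach the committed list, then run and free each task. */
int flip_worker(struct flip_work *work)
{
	struct task_list tasks = { NULL, NULL };
	int ran = 0;

	list_splice_to_tail(&work->committed, &tasks);
	while (tasks.head) {
		struct flip_task *task = tasks.head;

		tasks.head = task->next;
		work->func(work, task->data);
		free(task);
		ran++;
	}
	return ran;
}

/* Example callback: adds the int that data points at to a running total. */
int total;

void add_to_total(struct flip_work *work, void *data)
{
	(void)work;
	total += *(int *)data;
}
```

Because each queued operation owns its own node, queuing the same object twice is harmless, and a task queued after a commit stays invisible to the worker until the next commit.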

Best Regards,

Boris





-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


[PATCH 00/83] AMD HSA kernel driver

2014-07-11 Thread Jerome Glisse
On Thu, Jul 10, 2014 at 10:51:29PM +, Gabbay, Oded wrote:
> On Thu, 2014-07-10 at 18:24 -0400, Jerome Glisse wrote:
> > On Fri, Jul 11, 2014 at 12:45:27AM +0300, Oded Gabbay wrote:
> > >  This patch set implements a Heterogeneous System Architecture 
> > > (HSA) driver
> > >  for radeon-family GPUs.
> >  
> > > These are just quick comments on a few things. Given the size of this,
> > > people will need time to review things.
> >  
> > >  HSA allows different processor types (CPUs, DSPs, GPUs, etc..) to 
> > > share
> > >  system resources more effectively via HW features including 
> > > shared pageable
> > >  memory, userspace-accessible work queues, and platform-level 
> > > atomics. In
> > >  addition to the memory protection mechanisms in GPUVM and 
> > > IOMMUv2, the Sea
> > >  Islands family of GPUs also performs HW-level validation of 
> > > commands passed
> > >  in through the queues (aka rings).
> > >  The code in this patch set is intended to serve both as a sample 
> > > driver for
> > >  other HSA-compatible hardware devices and as a production driver 
> > > for
> > >  radeon-family processors. The code is architected to support 
> > > multiple CPUs
> > >  each with connected GPUs, although the current implementation 
> > > focuses on a
> > >  single Kaveri/Berlin APU, and works alongside the existing radeon 
> > > kernel
> > >  graphics driver (kgd).
> > >  AMD GPUs designed for use with HSA (Sea Islands and up) share 
> > > some hardware
> > >  functionality between HSA compute and regular gfx/compute (memory,
> > >  interrupts, registers), while other functionality has been added
> > >  specifically for HSA compute  (hw scheduler for virtualized 
> > > compute rings).
> > >  All shared hardware is owned by the radeon graphics driver, and 
> > > an interface
> > >  between kfd and kgd allows the kfd to make use of those shared 
> > > resources,
> > >  while HSA-specific functionality is managed directly by kfd by 
> > > submitting
> > >  packets into an HSA-specific command queue (the "HIQ").
> > >  During kfd module initialization a char device node (/dev/kfd) is 
> > > created
> > >  (surviving until module exit), with ioctls for queue creation & 
> > > management,
> > >  and data structures are initialized for managing HSA device 
> > > topology.
> > >  The rest of the initialization is driven by calls from the radeon 
> > > kgd at
> > >  the following points :
> > >  - radeon_init (kfd_init)
> > >  - radeon_exit (kfd_fini)
> > >  - radeon_driver_load_kms (kfd_device_probe, kfd_device_init)
> > >  - radeon_driver_unload_kms (kfd_device_fini)
> > >  During the probe and init processing per-device data structures 
> > > are
> > >  established which connect to the associated graphics kernel 
> > > driver. This
> > >  information is exposed to userspace via sysfs, along with a 
> > > version number
> > >  allowing userspace to determine if a topology change has occurred 
> > > while it
> > >  was reading from sysfs.
> > >  The interface between kfd and kgd also allows the kfd to request 
> > > buffer
> > >  management services from kgd, and allows kgd to route interrupt 
> > > requests to
> > >  kfd code since the interrupt block is shared between regular
> > >  graphics/compute and HSA compute subsystems in the GPU.
> > >  The kfd code works with an open source usermode library 
> > > ("libhsakmt") which
> > >  is in the final stages of IP review and should be published in a 
> > > separate
> > >  repo over the next few days.
> > >  The code operates in one of three modes, selectable via the 
> > > sched_policy
> > >  module parameter :
> > >  - sched_policy=0 uses a hardware scheduler running in the MEC 
> > > block within
> > >  CP, and allows oversubscription (more queues than HW slots)
> > >  - sched_policy=1 also uses HW scheduling but does not allow
> > >  oversubscription, so create_queue requests fail when we run out 
> > > of HW slots
> > >  - sched_policy=2 does not use HW scheduling, so the driver 
> > > manually assigns
> > >  queues to HW slots by programming registers
> > >  The "no HW scheduling" option is for debug & new hardware bringup 
> > > only, so
> > >  has less test coverage than the other options. Default in the 
> > > current code
> > >  is "HW scheduling without oversubscription" since that is where 
> > > we have the
> > >  most test coverage but we expect to change the default to "HW 
> > > scheduling
> > >  with oversubscription" after further testing. This effectively 
> > > removes the
> > >  HW limit on the number of work queues available to applications.
> > >  Programs running on the GPU are associated with an address space 
> > > through the
> > >  VMID field, which is translated to a unique PASID at access time 
> > > via a set
> > >  of 16 VMID-to-PASID mapping registers. The available VMIDs 
> > > (currently 16)
> > >  are partitioned (under control of the radeon kgd) between current
> > >  gfx/compute and HSA compute, with each getting 8 in the 

[PATCH RFC 06/15] drm/armada: move variant initialisation to CRTC init

2014-07-11 Thread Sebastian Hesselbarth
On 07/11/2014 04:37 PM, Russell King - ARM Linux wrote:
> On Sat, Jul 05, 2014 at 01:58:37PM +0200, Sebastian Hesselbarth wrote:
>> On 07/05/2014 12:38 PM, Russell King wrote:
>>> Move the variant initialisation entirely to the CRTC init function -
>>> the variant support is really about the CRTC properties than the whole
>>> system, and we want to treat each CRTC individually when we support DT.
>>>
>>> Signed-off-by: Russell King 
>>> ---
>> [...]
>>> diff --git a/drivers/gpu/drm/armada/armada_crtc.h 
>>> b/drivers/gpu/drm/armada/armada_crtc.h
>>> index 531a9b0bdcfb..3f0e70bb2e9c 100644
>>> --- a/drivers/gpu/drm/armada/armada_crtc.h
>>> +++ b/drivers/gpu/drm/armada/armada_crtc.h
>>> @@ -38,6 +38,7 @@ struct armada_crtc {
>>> unsigned num;
>>> void __iomem *base;
>>> struct clk  *clk;
>>> +   struct clk  *extclk[2];
>>
>> Russell,
>>
>> I wonder, if we should rename above array srcclk instead of extclk
>> while moving it anyway. That way we can use it for the other variant
>> specific clocks, too.
>
> As the patches are prepared with this change, I'd prefer to submit them
> as-is, and then we can update that as and when the support for things
> like the MMP/610 is finished off.  I think they're good to go, so I'll
> send them off later today to David.

Ok, sounds fine to me.

> This leaves the TDA998x componentisation patches which I need to kick
> out, and the initial DT changes.  Once those are in place, we should
> have almost all ducks lined up for working DRM support - it'll certainly
> be advanced enough to describe the LCD controllers and the TDA998x as
> three separate DT entities using the of graph helpers.

Ok.

> What's left is the display-subsystem { } entity to describe the makeup
> of the subsystem.  That's not included as we currently need to pass
> a block of memory, and the DT support for reserving chunks of memory
> appeared (last time I looked) to only be botch-merged (only half of it
> seems to have been merged making the whole reserved memory thing
> totally useless - why people only half-merge features I've no idea.)

There was a follow-up patch set for this some days ago
http://comments.gmane.org/gmane.linux.ports.arm.kernel/337686

Sebastian



[RFC PATCH] drm: rework flip-work helpers to avoid calling func when the FIFO is full

2014-07-11 Thread Boris BREZILLON
Make use of lists instead of kfifo in order to dynamically allocate a
task entry when someone requires some delayed work, thus preventing
drm_flip_work_queue from directly calling func instead of queuing this
call.
This allows drm_flip_work_queue to be safely called even within irq
handlers.

Add new helper functions to allocate a flip work task and queue it when
needed. This prevents allocating data within irq context (which might
impact the time spent in the irq handler).

Signed-off-by: Boris BREZILLON 
---
Hi Rob,

This is a proposal for what you suggested (dynamically growing the drm
flip work queue in order to avoid direct call of work->func when calling
drm_flip_work_queue).

I'm not sure this is exactly what you expected, because I'm now using
lists instead of kfifo (and thus lose the lockless part), but at least
we can now safely call drm_flip_work_queue or drm_flip_work_queue_task
from irq handlers :-).

You were also worried about queueing the same framebuffer multiple times
and with this implementation you shouldn't have any problem (at least with
drm_flip_work_queue; what people do with drm_flip_work_queue_task is their
own responsibility, but they should allocate one task for each operation
even if they are manipulating the same framebuffer).

This is just a suggestion, so don't hesitate to tell me that it doesn't
match your expectations.

Best Regards,

Boris

 drivers/gpu/drm/drm_flip_work.c | 95 ++---
 include/drm/drm_flip_work.h | 29 +
 2 files changed, 92 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/drm_flip_work.c b/drivers/gpu/drm/drm_flip_work.c
index f9c7fa3..21d5715 100644
--- a/drivers/gpu/drm/drm_flip_work.c
+++ b/drivers/gpu/drm/drm_flip_work.c
@@ -25,6 +25,43 @@
 #include "drm_flip_work.h"

 /**
+ * drm_flip_work_allocate_task - allocate a flip-work task
+ * @data: data associated to the task
+ *
+ * Allocate a drm_flip_task object and attach private data to it.
+ */
+struct drm_flip_task *drm_flip_work_allocate_task(void *data)
+{
+   struct drm_flip_task *task;
+
+   task = kzalloc(sizeof(*task), GFP_KERNEL);
+   if (task)
+   task->data = data;
+
+   return task;
+}
+EXPORT_SYMBOL(drm_flip_work_allocate_task);
+
+/**
+ * drm_flip_work_queue_task - queue a specific task
+ * @work: the flip-work
+ * @task: the task to handle
+ *
+ * Queues task, that will later be run (passed back to drm_flip_func_t
+ * func) on a work queue after drm_flip_work_commit() is called.
+ */
+void drm_flip_work_queue_task(struct drm_flip_work *work,
+ struct drm_flip_task *task)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&work->lock, flags);
+   list_add_tail(&task->node, &work->queued);
+   spin_unlock_irqrestore(&work->lock, flags);
+}
+EXPORT_SYMBOL(drm_flip_work_queue_task);
+
+/**
  * drm_flip_work_queue - queue work
  * @work: the flip-work
  * @val: the value to queue
@@ -34,10 +71,14 @@
  */
 void drm_flip_work_queue(struct drm_flip_work *work, void *val)
 {
-   if (kfifo_put(&work->fifo, val)) {
-   atomic_inc(&work->pending);
+   struct drm_flip_task *task;
+
+   task = kzalloc(sizeof(*task), GFP_KERNEL);
+   if (task) {
+   task->data = val;
+   drm_flip_work_queue_task(work, task);
} else {
-   DRM_ERROR("%s fifo full!\n", work->name);
+   DRM_ERROR("%s could not allocate task!\n", work->name);
work->func(work, val);
}
 }
@@ -56,9 +97,12 @@ EXPORT_SYMBOL(drm_flip_work_queue);
 void drm_flip_work_commit(struct drm_flip_work *work,
struct workqueue_struct *wq)
 {
-   uint32_t pending = atomic_read(&work->pending);
-   atomic_add(pending, &work->count);
-   atomic_sub(pending, &work->pending);
+   unsigned long flags;
+
+   spin_lock_irqsave(&work->lock, flags);
+   list_splice_tail(&work->queued, &work->commited);
+   INIT_LIST_HEAD(&work->queued);
+   spin_unlock_irqrestore(&work->lock, flags);
queue_work(wq, &work->worker);
 }
 EXPORT_SYMBOL(drm_flip_work_commit);
@@ -66,14 +110,26 @@ EXPORT_SYMBOL(drm_flip_work_commit);
 static void flip_worker(struct work_struct *w)
 {
struct drm_flip_work *work = container_of(w, struct drm_flip_work, 
worker);
-   uint32_t count = atomic_read(&work->count);
-   void *val = NULL;
+   struct list_head tasks;
+   unsigned long flags;

-   atomic_sub(count, &work->count);
+   while (1) {
+   struct drm_flip_task *task, *tmp;

-   while(count--)
-   if (!WARN_ON(!kfifo_get(&work->fifo, &val)))
-   work->func(work, val);
+   INIT_LIST_HEAD(&tasks);
+   spin_lock_irqsave(&work->lock, flags);
+   list_splice_tail(&work->commited, &tasks);
+   INIT_LIST_HEAD(&work->commited);
+   spin_unlock_irqrestore(&work->lock, flags);
+
+   if (list_empty(&tasks))
+   break;
+
+   list_for_each_entry_safe(task, tmp, &tasks, node) {
+ 

[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-11 Thread Bridgman, John


>-----Original Message-----
>From: dri-devel [mailto:dri-devel-bounces at lists.freedesktop.org] On Behalf
>Of Alex Deucher
>Sent: Friday, July 11, 2014 12:23 PM
>To: Koenig, Christian
>Cc: Oded Gabbay; Lewycky, Andrew; LKML; Maling list - DRI developers;
>Deucher, Alexander
>Subject: Re: [PATCH 02/83] drm/radeon: reduce number of free VMIDs and
>pipes in KV
>
>On Fri, Jul 11, 2014 at 12:18 PM, Christian König
>wrote:
>> Am 11.07.2014 18:05, schrieb Jerome Glisse:
>>
>>> On Fri, Jul 11, 2014 at 12:50:02AM +0300, Oded Gabbay wrote:

>>>> To support HSA on KV, we need to limit the number of vmids and pipes
>>>> that are available for radeon's use with KV.
>>>>
>>>> This patch reserves VMIDs 8-15 for KFD (so radeon can only use VMIDs
>>>> 0-7) and also makes radeon think that KV has only a single MEC with
>>>> a single pipe in it
>>>>
>>>> Signed-off-by: Oded Gabbay
>>>
>>> Reviewed-by: Jérôme Glisse
>>
>>
>> At least for the VMIDs, on-demand allocation should be trivial to
>> implement, so I would prefer this to a fixed assignment.
>
>IIRC, the way the CP hw scheduler works you have to give it a range of vmids
>and it assigns them dynamically as queues are mapped so effectively they
>are potentially in use once the CP scheduler is set up.
>
>Alex

Right. The SET_RESOURCES packet (kfd_pm4_headers.h, added in patch 49) 
allocates a range of HW queues, VMIDs and GDS to the HW scheduler, then the 
scheduler uses the allocated VMIDs to support a potentially larger number of 
user processes by dynamically mapping PASIDs to VMIDs and memory queue 
descriptors (MQDs) to HW queues.
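
For readers following along, the idea of serving many processes from a small fixed VMID range can be sketched as below. This is only an illustration under loud assumptions: the real mapping is performed by the CP hardware scheduler and is not documented in this thread, so the table layout and the names vmid_pool, vmid_acquire and vmid_release are hypothetical. Only the 8-15 range comes from the patch description.

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_KFD_VMID 8 /* from the patch description: VMIDs 8-15 go to kfd */
#define NUM_KFD_VMIDS  8

/* Hypothetical table: one slot per VMID given to the scheduler; 0 = free. */
struct vmid_pool {
	uint32_t pasid[NUM_KFD_VMIDS];
};

/* Return the VMID already bound to pasid, bind a free one, or -1 if none. */
int vmid_acquire(struct vmid_pool *pool, uint32_t pasid)
{
	int free_slot = -1;

	assert(pasid != 0); /* 0 marks a free slot in this sketch */
	for (int i = 0; i < NUM_KFD_VMIDS; i++) {
		if (pool->pasid[i] == pasid)
			return FIRST_KFD_VMID + i;
		if (pool->pasid[i] == 0 && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -1; /* caller must first unmap another process */
	pool->pasid[free_slot] = pasid;
	return FIRST_KFD_VMID + free_slot;
}

/* Unbind pasid so its VMID can serve another process. */
void vmid_release(struct vmid_pool *pool, uint32_t pasid)
{
	for (int i = 0; i < NUM_KFD_VMIDS; i++)
		if (pool->pasid[i] == pasid)
			pool->pasid[i] = 0;
}
```

Since a VMID can be bound as soon as a queue is mapped, every VMID in the range handed over has to be treated as potentially in use, which is the point Alex makes in the quoted text.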

BTW Oded I think we have some duplicated defines at the end of 
kfd_pm4_headers.h, if they are really duplicates it would be great to remove 
those before the pull request.

Thanks,
JB

>
>
>>
>> Christian.
>>
>>
>>>
 ---
   drivers/gpu/drm/radeon/cik.c | 48
 ++--
   1 file changed, 24 insertions(+), 24 deletions(-)

 diff --git a/drivers/gpu/drm/radeon/cik.c
 b/drivers/gpu/drm/radeon/cik.c index 4bfc2c0..e0c8052 100644
 --- a/drivers/gpu/drm/radeon/cik.c
 +++ b/drivers/gpu/drm/radeon/cik.c
 @@ -4662,12 +4662,11 @@ static int cik_mec_init(struct radeon_device
 *rdev)
 /*
  * KV:2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
  * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues
 total
 +* Nonetheless, we assign only 1 pipe because all other
 + pipes
 will
 +* be handled by KFD
  */
 -   if (rdev->family == CHIP_KAVERI)
 -   rdev->mec.num_mec = 2;
 -   else
 -   rdev->mec.num_mec = 1;
 -   rdev->mec.num_pipe = 4;
 +   rdev->mec.num_mec = 1;
 +   rdev->mec.num_pipe = 1;
 rdev->mec.num_queue = rdev->mec.num_mec * rdev-
>>mec.num_pipe * 8;
 if (rdev->mec.hpd_eop_obj == NULL) { @@ -4809,28 +4808,24 @@
 static int cik_cp_compute_resume(struct radeon_device *rdev)
 /* init the pipes */
 mutex_lock(>srbm_mutex);
 -   for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
 -   int me = (i < 4) ? 1 : 2;
 -   int pipe = (i < 4) ? i : (i - 4);
   - eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i *
 MEC_HPD_SIZE * 2);
 +   eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;
   - cik_srbm_select(rdev, me, pipe, 0, 0);
 +   cik_srbm_select(rdev, 0, 0, 0, 0);
   - /* write the EOP addr */
 -   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
 -   WREG32(CP_HPD_EOP_BASE_ADDR_HI,
 upper_32_bits(eop_gpu_addr) >> 8);
 +   /* write the EOP addr */
 +   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
 +   WREG32(CP_HPD_EOP_BASE_ADDR_HI,
>upper_32_bits(eop_gpu_addr)
 + >>
 8);
   - /* set the VMID assigned */
 -   WREG32(CP_HPD_EOP_VMID, 0);
 +   /* set the VMID assigned */
 +   WREG32(CP_HPD_EOP_VMID, 0);
 +
 +   /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
 +   tmp = RREG32(CP_HPD_EOP_CONTROL);
 +   tmp &= ~EOP_SIZE_MASK;
 +   tmp |= order_base_2(MEC_HPD_SIZE / 8);
 +   WREG32(CP_HPD_EOP_CONTROL, tmp);
   - /* set the EOP size, register value is 2^(EOP_SIZE+1)
 dwords */
 -   tmp = RREG32(CP_HPD_EOP_CONTROL);
 -   tmp &= ~EOP_SIZE_MASK;
 -   tmp |= order_base_2(MEC_HPD_SIZE / 8);
 -   WREG32(CP_HPD_EOP_CONTROL, tmp);
 -   }
 -   cik_srbm_select(rdev, 0, 0, 0, 0);
 mutex_unlock(>srbm_mutex);
 /* init the queues.  Just two for now. */ @@ -5876,8
 +5871,13 @@ int cik_ib_parse(struct radeon_device *rdev, struct
 radeon_ib *ib)

[PATCH 13/83] hsa/radeon: Add 2 new IOCTL to kfd, CREATE_QUEUE and DESTROY_QUEUE

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:13AM +0300, Oded Gabbay wrote:
> This patch adds 2 new IOCTL to kfd driver.
> 
> The first IOCTL is KFD_IOC_CREATE_QUEUE that is used by the user-mode
> application to create a compute queue on the GPU.
> 
> The second IOCTL is KFD_IOC_DESTROY_QUEUE that is used by the
> user-mode application to destroy an existing compute queue on the GPU.
> 
> Signed-off-by: Oded Gabbay 
> ---
>  drivers/gpu/hsa/radeon/kfd_chardev.c  | 155 
> ++
>  drivers/gpu/hsa/radeon/kfd_doorbell.c |  11 +++
>  include/uapi/linux/kfd_ioctl.h|  69 +++
>  3 files changed, 235 insertions(+)
>  create mode 100644 include/uapi/linux/kfd_ioctl.h
> 
> diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
> b/drivers/gpu/hsa/radeon/kfd_chardev.c
> index 0b5bc74..4e7d5d0 100644
> --- a/drivers/gpu/hsa/radeon/kfd_chardev.c
> +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
> @@ -27,11 +27,13 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "kfd_priv.h"
>  #include "kfd_scheduler.h"
>  
>  static long kfd_ioctl(struct file *, unsigned int, unsigned long);
>  static int kfd_open(struct inode *, struct file *);
> +static int kfd_mmap(struct file *, struct vm_area_struct *);
>  
>  static const char kfd_dev_name[] = "kfd";
>  
> @@ -108,17 +110,170 @@ kfd_open(struct inode *inode, struct file *filep)
>   return 0;
>  }
>  
> +static long
> +kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void 
> __user *arg)
> +{
> + struct kfd_ioctl_create_queue_args args;
> + struct kfd_dev *dev;
> + int err = 0;
> + unsigned int queue_id;
> + struct kfd_queue *queue;
> + struct kfd_process_device *pdd;
> +
> + if (copy_from_user(&args, arg, sizeof(args)))
> + return -EFAULT;
> +
> + dev = radeon_kfd_device_by_id(args.gpu_id);
> + if (dev == NULL)
> + return -EINVAL;
> +
> + queue = kzalloc(
> + offsetof(struct kfd_queue, scheduler_queue) + 
> dev->device_info->scheduler_class->queue_size,
> + GFP_KERNEL);
> +
> + if (!queue)
> + return -ENOMEM;
> +
> + queue->dev = dev;
> +
> + mutex_lock(&p->mutex);
> +
> + pdd = radeon_kfd_bind_process_to_device(dev, p);
> + if (IS_ERR(pdd) < 0) {
> + err = PTR_ERR(pdd);
> + goto err_bind_pasid;
> + }
> +
> + pr_debug("kfd: creating queue number %d for PASID %d on GPU 0x%x\n",
> + pdd->queue_count,
> + p->pasid,
> + dev->id);
> +
> + if (pdd->queue_count++ == 0) {
> + err = 
> dev->device_info->scheduler_class->register_process(dev->scheduler, p, 
> &pdd->scheduler_process);
> + if (err < 0)
> + goto err_register_process;
> + }
> +
> + if (!radeon_kfd_allocate_queue_id(p, &queue_id))
> + goto err_allocate_queue_id;
> +
> + err = dev->device_info->scheduler_class->create_queue(dev->scheduler, 
> pdd->scheduler_process,
> +   
> &queue->scheduler_queue,
> +   (void __user 
> *)args.ring_base_address,
> +   args.ring_size,
> +   (void __user 
> *)args.read_pointer_address,
> +   (void __user 
> *)args.write_pointer_address,
> +   
> radeon_kfd_queue_id_to_doorbell(dev, p, queue_id));
> + if (err)
> + goto err_create_queue;
> +
> + radeon_kfd_install_queue(p, queue_id, queue);
> +
> + args.queue_id = queue_id;
> + args.doorbell_address = 
> (uint64_t)(uintptr_t)radeon_kfd_get_doorbell(filep, p, dev, queue_id);
> +
> + if (copy_to_user(arg, &args, sizeof(args))) {
> + err = -EFAULT;
> + goto err_copy_args_out;
> + }
> +
> + mutex_unlock(&p->mutex);
> +
> + pr_debug("kfd: queue id %d was created successfully.\n"
> +  " ring buffer address == 0x%016llX\n"
> +  " read ptr address== 0x%016llX\n"
> +  " write ptr address   == 0x%016llX\n"
> +  " doorbell address== 0x%016llX\n",
> + args.queue_id,
> + args.ring_base_address,
> + args.read_pointer_address,
> + args.write_pointer_address,
> + args.doorbell_address);
> +
> + return 0;
> +
> +err_copy_args_out:
> + dev->device_info->scheduler_class->destroy_queue(dev->scheduler, 
> &queue->scheduler_queue);
> +err_create_queue:
> + radeon_kfd_remove_queue(p, queue_id);
> +err_allocate_queue_id:
> + if (--pdd->queue_count == 0) {
> + 
> dev->device_info->scheduler_class->deregister_process(dev->scheduler, 
> pdd->scheduler_process);
> + 

[PATCH 44/83] hsa/radeon: HSA64/HSA32 modes support

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:54:00AM +0300, Oded Gabbay wrote:
> From: Alexey Skidanov 
> 
> Added apertures initialization and appropriate ioctl

What is a process aperture and what is it used for? This is a very
cryptic commit message.

Cheers,
Jérôme

> 
> Signed-off-by: Alexey Skidanov 
> Signed-off-by: Oded Gabbay 
> ---
>  drivers/gpu/hsa/radeon/Makefile   |   2 +-
>  drivers/gpu/hsa/radeon/kfd_aperture.c | 124 
> ++
>  drivers/gpu/hsa/radeon/kfd_chardev.c  |  58 +++-
>  drivers/gpu/hsa/radeon/kfd_priv.h |  18 
>  drivers/gpu/hsa/radeon/kfd_process.c  |  17 
>  drivers/gpu/hsa/radeon/kfd_sched_cik_static.c |   3 +-
>  drivers/gpu/hsa/radeon/kfd_topology.c |  27 ++
>  include/uapi/linux/kfd_ioctl.h|  18 
>  8 files changed, 264 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_aperture.c
> 
> diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
> index 5422e6a..813b31f 100644
> --- a/drivers/gpu/hsa/radeon/Makefile
> +++ b/drivers/gpu/hsa/radeon/Makefile
> @@ -5,6 +5,6 @@
>  radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \
>   kfd_pasid.o kfd_topology.o kfd_process.o \
>   kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
> - kfd_vidmem.o kfd_interrupt.o
> + kfd_vidmem.o kfd_interrupt.o kfd_aperture.o
>  
>  obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o
> diff --git a/drivers/gpu/hsa/radeon/kfd_aperture.c 
> b/drivers/gpu/hsa/radeon/kfd_aperture.c
> new file mode 100644
> index 000..9e2d6da
> --- /dev/null
> +++ b/drivers/gpu/hsa/radeon/kfd_aperture.c
> @@ -0,0 +1,124 @@
> +/*
> + * Copyright 2014 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "kfd_priv.h"
> +#include "kfd_scheduler.h"
> +#include 
> +#include 
> +#include 
> +
> +
> +#define MAKE_GPUVM_APP_BASE(gpu_num) (((uint64_t)(gpu_num) << 61) + 
> 0x1)
> +#define MAKE_GPUVM_APP_LIMIT(base) (((uint64_t)(base) & 0xFF00) 
> | 0xFF)
> +#define MAKE_SCRATCH_APP_BASE(gpu_num) (((uint64_t)(gpu_num) << 61) + 
> 0x1)
> +#define MAKE_SCRATCH_APP_LIMIT(base) (((uint64_t)base & 0x) 
> | 0x)
> +#define MAKE_LDS_APP_BASE(gpu_num) (((uint64_t)(gpu_num) << 61) + 0x0)
> +#define MAKE_LDS_APP_LIMIT(base) (((uint64_t)(base) & 0x) | 
> 0x)
> +
> +#define HSA_32BIT_LDS_APP_SIZE 0x1
> +#define HSA_32BIT_LDS_APP_ALIGNMENT 0x1
> +
> +static unsigned long kfd_reserve_aperture(struct kfd_process *process, unsigned long len, unsigned long alignment)
> +{
> +
> + unsigned long addr = 0;
> + unsigned long start_address;
> +
> + /*
> +  * Go bottom up and find the first available aligned address.
> +  * We may narrow space to scan by getting mmap range limits.
> +  */
> + for (start_address =  alignment; start_address < (TASK_SIZE - alignment); start_address += alignment) {
> + addr = vm_mmap(NULL, start_address, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, 0);
> + if (!IS_ERR_VALUE(addr)) {
> + if (addr == start_address)
> + return addr;
> + vm_munmap(addr, len);
> + }
> + }
> + return 0;
> +
> +}
> +
> +int kfd_init_apertures(struct kfd_process *process)
> +{
> + uint8_t id  = 0;
> + struct kfd_dev *dev;
> + struct kfd_process_device *pdd;
> +
> + mutex_lock(&process->mutex);
> +
> + /*Iterating over all devices*/
> + while ((dev = kfd_topology_enum_kfd_devices(id)) != NULL && id < NUM_OF_SUPPORTED_GPUS) {
> +
> + pdd 

[PATCH 32/83] hsa/radeon: implementing IOCTL for clock counters

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:53:48AM +0300, Oded Gabbay wrote:
> From: Evgeny Pinchuk 
> 
> Implemented new IOCTL to query the CPU and GPU clock counters.
> 
> Signed-off-by: Evgeny Pinchuk 
> Signed-off-by: Oded Gabbay 
> ---
>  drivers/gpu/hsa/radeon/kfd_chardev.c | 37 
> 
>  include/uapi/linux/kfd_ioctl.h   |  9 +
>  2 files changed, 46 insertions(+)
> 
> diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c b/drivers/gpu/hsa/radeon/kfd_chardev.c
> index ddaf357..d6fa980 100644
> --- a/drivers/gpu/hsa/radeon/kfd_chardev.c
> +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "kfd_priv.h"
>  #include "kfd_scheduler.h"
>  
> @@ -284,6 +285,38 @@ out:
>   return err;
>  }
>  
> +static long
> +kfd_ioctl_get_clock_counters(struct file *filep, struct kfd_process *p, void __user *arg)
> +{
> + struct kfd_ioctl_get_clock_counters_args args;
> + struct kfd_dev *dev;
> + struct timespec time;
> +
> + if (copy_from_user(&args, arg, sizeof(args)))
> + return -EFAULT;
> +
> + dev = radeon_kfd_device_by_id(args.gpu_id);
> + if (dev == NULL)
> + return -EINVAL;
> +
> + /* Reading GPU clock counter from KGD */
> + args.gpu_clock_counter = kfd2kgd->get_gpu_clock_counter(dev->kgd);
> +
> + /* No access to rdtsc. Using raw monotonic time */
> + getrawmonotonic(&time);
> + args.cpu_clock_counter = time.tv_nsec;

Is the GPU clock counter monotonic too? Does that hold even after a GPU reset
(hard reset included)? What could go wrong if it rolls back?

> +
> + get_monotonic_boottime(&time);
> + args.system_clock_counter = time.tv_nsec;
> +
> + /* Since the counter is in nano-seconds we use 1GHz frequency */
> + args.system_clock_freq = 1000000000;
> +
> + if (copy_to_user(arg, &args, sizeof(args)))
> + return -EFAULT;
> +
> + return 0;
> +}
>  
>  static long
>  kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> @@ -312,6 +345,10 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>   err = kfd_ioctl_set_memory_policy(filep, process, (void __user *)arg);
>   break;
>  
> + case KFD_IOC_GET_CLOCK_COUNTERS:
> + err = kfd_ioctl_get_clock_counters(filep, process, (void __user *)arg);
> + break;
> +
>   default:
>   dev_err(kfd_device,
>   "unknown ioctl cmd 0x%x, arg 0x%lx)\n",
> diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
> index 928e628..5b9517e 100644
> --- a/include/uapi/linux/kfd_ioctl.h
> +++ b/include/uapi/linux/kfd_ioctl.h
> @@ -70,12 +70,21 @@ struct kfd_ioctl_set_memory_policy_args {
>   uint64_t alternate_aperture_size;   /* to KFD */
>  };
>  
> +struct kfd_ioctl_get_clock_counters_args {
> + uint32_t gpu_id;/* to KFD */
> + uint64_t gpu_clock_counter; /* from KFD */
> + uint64_t cpu_clock_counter; /* from KFD */
> + uint64_t system_clock_counter;  /* from KFD */
> + uint64_t system_clock_freq; /* from KFD */
> +};
> +
>  #define KFD_IOC_MAGIC 'K'
>  
>  #define KFD_IOC_GET_VERSION         _IOR(KFD_IOC_MAGIC, 1, struct kfd_ioctl_get_version_args)
>  #define KFD_IOC_CREATE_QUEUE        _IOWR(KFD_IOC_MAGIC, 2, struct kfd_ioctl_create_queue_args)
>  #define KFD_IOC_DESTROY_QUEUE       _IOWR(KFD_IOC_MAGIC, 3, struct kfd_ioctl_destroy_queue_args)
>  #define KFD_IOC_SET_MEMORY_POLICY   _IOW(KFD_IOC_MAGIC, 4, struct kfd_ioctl_set_memory_policy_args)
> +#define KFD_IOC_GET_CLOCK_COUNTERS  _IOWR(KFD_IOC_MAGIC, 5, struct kfd_ioctl_get_clock_counters_args)
>  
>  #pragma pack(pop)
>  
> -- 
> 1.9.1
> 
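
As quoted, the patch copies only time.tv_nsec into the counters, and that field
alone restarts from zero every second. The following is a standalone userspace
sketch, not the driver's code, of folding both timespec fields into one 64-bit
nanosecond count that keeps increasing across second boundaries. The helper name
timespec_to_ns and the NSEC_PER_SEC_SKETCH constant are made up for
illustration.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical constant for this sketch: nanoseconds in one second. */
#define NSEC_PER_SEC_SKETCH 1000000000ULL

/*
 * Combine seconds and nanoseconds into a single 64-bit nanosecond value.
 * Using tv_nsec alone would wrap back to 0 at every second boundary.
 */
static uint64_t timespec_to_ns(const struct timespec *ts)
{
	/* tv_nsec is expected to be normalised to [0, 1e9). */
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < (long)NSEC_PER_SEC_SKETCH);

	return (uint64_t)ts->tv_sec * NSEC_PER_SEC_SKETCH +
	       (uint64_t)ts->tv_nsec;
}
```

With a counter built this way, a reported frequency of 1 GHz matches the unit,
since one tick is one nanosecond.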


[Bug 79051] Panic with radeon hd 5750, bisected

2014-07-11 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=79051

Jonathan Howard  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |PATCH_ALREADY_AVAILABLE

--- Comment #3 from Jonathan Howard  ---
3.16-rc4 and 3.15.5 are both working. I expect (unchecked) that the patch from
the discussion above has been applied.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[Bug 79071] Hang with dpm radeon hd 5750, pcie 1.1 motherboard

2014-07-11 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=79071

Jonathan Howard  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |CODE_FIX

--- Comment #5 from Jonathan Howard  ---
Patched and working
3.16-rc4 3.15.5 3.14.12



[PATCH RFC 06/15] drm/armada: move variant initialisation to CRTC init

2014-07-11 Thread Russell King - ARM Linux
On Fri, Jul 11, 2014 at 05:18:50PM +0200, Sebastian Hesselbarth wrote:
> On 07/11/2014 04:37 PM, Russell King - ARM Linux wrote:
>> What's left is the display-subsystem { } entity to describe the makeup
>> of the subsystem.  That's not included as we currently need to pass
>> a block of memory, and the DT support for reserving chunks of memory
>> appeared (last time I looked) to only be botch-merged (only half of it
>> seems to have been merged making the whole reserved memory thing
>> totally useless - why people only half-merge features I've no idea.)
>
> There was a follow-up patch set for this some days ago
> http://comments.gmane.org/gmane.linux.ports.arm.kernel/337686

Yes, I did a bit of digging a while back and found the outstanding
stuff, but it wasn't clear what's happening with it.  As it isn't
part of mainline, and I don't want to pick up further patches to
add dependencies, I decided it was better to stick with old proven
ways of a manually declared platform device for the time being.

-- 
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.


[Bug 81239] New: Evolution window content not shown fully (only desktop background)

2014-07-11 Thread bugzilla-dae...@freedesktop.org
ON__ = "start_thread"
#10 0xb4b12fee in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:129
No locals.

Thread 2 (Thread 0x997ffb40 (LWP 25806)):
#0  0xb76fc424 in __kernel_vsyscall ()
No symbol table info available.
#1  0xb742c015 in pthread_cond_timedwait@@GLIBC_2.3.2 () at
../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/pthread_cond_timedwait.S:245
No locals.
#2  0xb4c61a07 in g_cond_wait_until (cond=cond at entry=0xba6ef6a0,
mutex=mutex at entry=0xba6ef698, end_time=5251695126)
at /build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gthread-posix.c:898
ts = {tv_sec = 5251, tv_nsec = 695126000}
status = 
#3  0xb4befc51 in g_async_queue_pop_intern_unlocked
(queue=queue at entry=0xba6ef698, wait=wait at entry=1, end_time=5251695126)
at /build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gasyncqueue.c:422
retval = 
__FUNCTION__ = "g_async_queue_pop_intern_unlocked"
#4  0xb4bf03d2 in g_async_queue_timeout_pop (queue=0xba6ef698,
timeout=1500)
at /build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gasyncqueue.c:543
end_time = 5251695126
retval = 
#5  0xb4c438a5 in g_thread_pool_wait_for_new_pool () at
/build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gthreadpool.c:167
pool = 
local_max_idle_time = 15000
local_wakeup_thread_serial = 
local_max_unused_threads = 2
last_wakeup_thread_serial = 0
have_relayed_thread_marker = 
#6  g_thread_pool_thread_proxy (data=0x9f7c0500) at
/build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gthreadpool.c:364
free_pool = 
task = 0x3a98
pool = 
#7  0xb4c42d4a in g_thread_proxy (data=0x8f9f5750) at
/build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gthread.c:764
thread = 0x8f9f5750
#8  0xb7427efb in start_thread (arg=0x997ffb40) at pthread_create.c:309
__res = 
pd = 0x997ffb40
now = 
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-1220304896, -1719665856,
4001536, -1719668184, -1601835586, -610624030}, 
  mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data =
{prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = 
pagesize_m1 = 
sp = 
freesize = 
__PRETTY_FUNCTION__ = "start_thread"
#9  0xb4b12fee in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:129
No locals.

Thread 1 (Thread 0xb09d7900 (LWP 8284)):
#0  0xb76fc424 in __kernel_vsyscall ()
No symbol table info available.
#1  0xb4b087ab in poll () at ../sysdeps/unix/syscall-template.S:81
No locals.
#2  0xb4c2b38b in poll (__timeout=__timeout at entry=1980, __nfds=__nfds at 
entry=4,
__fds=__fds at entry=0x9f94e500)
at /usr/include/i386-linux-gnu/bits/poll2.h:46
No locals.
#3  g_poll (fds=fds at entry=0x9f94e500, nfds=nfds at entry=4,
timeout=timeout at entry=1980)
at /build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gpoll.c:124
No locals.
#4  0xb4c1c518 in g_main_context_poll (priority=2147483647, n_fds=4,
fds=0x9f94e500, timeout=1980, context=0xb8d7f780)
at /build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gmain.c:4028
poll_func = 0xb4c2b360 
#5  g_main_context_iterate (context=0xb8d7f780, block=block at entry=1,
dispatch=dispatch at entry=1, self=)
at /build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gmain.c:3729
max_priority = 2147483647
timeout = 1980
some_ready = 
nfds = 4
allocated_nfds = 
fds = 0x9f94e500
#6  0xb4c1c89b in g_main_loop_run (loop=loop at entry=0x9f3312d8) at
/build/glib2.0-f_gKLq/glib2.0-2.40.0/./glib/gmain.c:3928
__FUNCTION__ = "g_main_loop_run"
#7  0xb506b0ad in gtk_main () at
/build/gtk+3.0-sS5UZs/gtk+3.0-3.12.2/./gtk/gtkmain.c:1192
loop = 0x9f3312d8
#8  0xb7720a9e in main (argc=1, argv=0xbf9c1474) at main.c:680
shell = 0xb8ed70b8
settings = 
error = 0x0

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 79659] R9 270X lockup with unigine valley since radeonsi: enable ARB_sample_shading

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=79659

--- Comment #10 from Andy Furniss  ---
(In reply to comment #9)
> The hangs are gone if I apply my workaround which fixes the compile failures.

If you mean -

st/mesa, gallium: add a workaround for Unigine Heaven 4.0 and Valley 1.0

I hadn't tried, I assumed they would go in, and now it looks like the stuff in
common has moved up a level.

Checking patch src/gallium/state_trackers/dri/common/dri_context.c...
error: src/gallium/state_trackers/dri/common/dri_context.c: No such file or
directory



[PATCH 15/83] hsa/radeon: Add interrupt handling module

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:15AM +0300, Oded Gabbay wrote:
> This patch adds the interrupt handling module, in kfd_interrupt.c,
> and its related members in different data structures to the KFD
> driver.
> 
> The KFD interrupt module maintains an internal interrupt ring per kfd
> device. The internal interrupt ring contains interrupts that need further
> handling. The extra handling is deferred to a later time through a workqueue.
> 
> There's no acknowledgment for the interrupts we use. The hardware simply 
> queues a new interrupt each time without waiting.
> 
> The fixed-size internal queue means that it's possible for us to lose 
> interrupts because we have no back-pressure to the hardware.
> 
> Signed-off-by: Oded Gabbay 
> ---
>  drivers/gpu/hsa/radeon/Makefile|   2 +-
>  drivers/gpu/hsa/radeon/kfd_device.c|   1 +
>  drivers/gpu/hsa/radeon/kfd_interrupt.c | 179 
> +
>  drivers/gpu/hsa/radeon/kfd_priv.h  |  18 
>  drivers/gpu/hsa/radeon/kfd_scheduler.h |   3 +
>  5 files changed, 202 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_interrupt.c
> 
> diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
> index 28da10c..5422e6a 100644
> --- a/drivers/gpu/hsa/radeon/Makefile
> +++ b/drivers/gpu/hsa/radeon/Makefile
> @@ -5,6 +5,6 @@
>  radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \
>   kfd_pasid.o kfd_topology.o kfd_process.o \
>   kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
> - kfd_vidmem.o
> + kfd_vidmem.o kfd_interrupt.o
>  
>  obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o
> diff --git a/drivers/gpu/hsa/radeon/kfd_device.c b/drivers/gpu/hsa/radeon/kfd_device.c
> index 465c822..b2d2861 100644
> --- a/drivers/gpu/hsa/radeon/kfd_device.c
> +++ b/drivers/gpu/hsa/radeon/kfd_device.c
> @@ -30,6 +30,7 @@
>  static const struct kfd_device_info bonaire_device_info = {
>   .scheduler_class = _kfd_cik_static_scheduler_class,
>   .max_pasid_bits = 16,
> + .ih_ring_entry_size = 4 * sizeof(uint32_t)
>  };
>  
>  struct kfd_deviceid {
> diff --git a/drivers/gpu/hsa/radeon/kfd_interrupt.c b/drivers/gpu/hsa/radeon/kfd_interrupt.c
> new file mode 100644
> index 000..2179780
> --- /dev/null
> +++ b/drivers/gpu/hsa/radeon/kfd_interrupt.c
> @@ -0,0 +1,179 @@
> +/*
> + * Copyright 2014 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + */
> +
> +/*
> + * KFD Interrupts.
> + *
> + * AMD GPUs deliver interrupts by pushing an interrupt description onto the
> + * interrupt ring and then sending an interrupt. KGD receives the interrupt
> + * in ISR and sends us a pointer to each new entry on the interrupt ring.
> + *
> + * We generally can't process interrupt-signaled events from ISR, so we call
> + * out to each interrupt client module (currently only the scheduler) to ask if
> + * each interrupt is interesting. If they return true, then it requires further
> + * processing so we copy it to an internal interrupt ring and call each
> + * interrupt client again from a work-queue.
> + *
> + * There's no acknowledgment for the interrupts we use. The hardware simply
> + * queues a new interrupt each time without waiting.
> + *
> + * The fixed-size internal queue means that it's possible for us to lose
> + * interrupts because we have no back-pressure to the hardware.
> + */
> +
> +#include 
> +#include 
> +#include "kfd_priv.h"
> +#include "kfd_scheduler.h"
> +
> +#define KFD_INTERRUPT_RING_SIZE 256
> +
> +static void interrupt_wq(struct work_struct *);
> +
> +int
> +radeon_kfd_interrupt_init(struct kfd_dev *kfd)
> +{
> + void *interrupt_ring = kmalloc_array(KFD_INTERRUPT_RING_SIZE,
> + kfd->device_info->ih_ring_entry_size,
> + GFP_KERNEL);
> + if 
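
The commit message above describes a fixed-size internal ring that can lose
interrupts because nothing pushes back on the hardware. A minimal
single-threaded sketch of that idea follows. It is not the kfd_interrupt.c
implementation: the struct and function names are hypothetical, the sizes only
mirror KFD_INTERRUPT_RING_SIZE and ih_ring_entry_size from the quoted patch, and
a real driver would also need locking or barriers between the ISR and the
workqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RING_SIZE  256                     /* mirrors KFD_INTERRUPT_RING_SIZE */
#define SKETCH_ENTRY_SIZE (4 * sizeof(uint32_t))  /* mirrors ih_ring_entry_size */

/* Hypothetical ring: one slot is kept empty to tell "full" from "empty". */
struct sketch_intr_ring {
	uint8_t slots[SKETCH_RING_SIZE][SKETCH_ENTRY_SIZE];
	unsigned int rptr;      /* next slot the consumer reads */
	unsigned int wptr;      /* next slot the producer writes */
	unsigned long dropped;  /* entries lost because the ring was full */
};

/* Producer side (ISR in the real driver): copy the entry in, or drop it. */
static bool sketch_ring_enqueue(struct sketch_intr_ring *r, const void *entry)
{
	unsigned int next;

	assert(r != NULL && entry != NULL);
	next = (r->wptr + 1) % SKETCH_RING_SIZE;
	if (next == r->rptr) {
		/* No back-pressure is possible, so the entry is simply lost. */
		r->dropped++;
		return false;
	}
	memcpy(r->slots[r->wptr], entry, SKETCH_ENTRY_SIZE);
	r->wptr = next;
	return true;
}

/* Consumer side (workqueue in the real driver): pop the oldest entry. */
static bool sketch_ring_dequeue(struct sketch_intr_ring *r, void *entry)
{
	assert(r != NULL && entry != NULL);
	if (r->rptr == r->wptr)
		return false;   /* ring is empty */
	memcpy(entry, r->slots[r->rptr], SKETCH_ENTRY_SIZE);
	r->rptr = (r->rptr + 1) % SKETCH_RING_SIZE;
	return true;
}
```

Keeping one slot empty means this sketch holds at most 255 entries; counting
drops at least makes the loss visible to whoever debugs it.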

[PATCH 2/2] drm/radeon: add user pointer support v4

2014-07-11 Thread Christian König
From: Christian König 

This patch adds an IOCTL for turning a pointer supplied by
userspace into a buffer object.

It imposes several restrictions upon the memory being mapped:

1. It must be page aligned (both start and end addresses, i.e. ptr and size).

2. It must be normal system memory, not a pointer into another map of IO
space (e.g. it must not be a GTT mmapping of another object).

3. The BO is mapped into GTT, so the maximum amount of memory mapped at
all times is still the GTT limit.

4. The BO is only mapped readonly for now, so no write support.

5. List of backing pages is only acquired once, so they represent a
snapshot of the first use.

Exporting and sharing as well as mapping of buffer objects created by
this function is forbidden and results in an -EPERM.
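
Restriction 1 above can be illustrated with a small standalone check. This is a
sketch under assumptions rather than the code from this patch: the function
name is hypothetical and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Return true when both the start address and the size are page aligned,
 * which also makes the end address page aligned.
 */
static bool sketch_userptr_range_ok(uint64_t addr, uint64_t size)
{
	const uint64_t mask = SKETCH_PAGE_SIZE - 1;

	/* The mask trick below only works for power-of-two page sizes. */
	assert((SKETCH_PAGE_SIZE & mask) == 0);

	if (size == 0)
		return false;           /* nothing to map */
	if ((addr & mask) || (size & mask))
		return false;           /* start or length not page aligned */
	if (addr + size < addr)
		return false;           /* range wraps around the address space */
	return true;
}
```

A caller would typically turn a false result into -EINVAL before touching any
pages.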

v2: squash all previous changes into first public version
v3: fix tabs, map readonly, don't use MM callback any more
v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
pin/unpin pages on bind/unbind instead of populate/unpopulate

Signed-off-by: Christian König 
Reviewed-by: Alex Deucher  (v3)
Reviewed-by: Jérôme Glisse  (v3)
---
 drivers/gpu/drm/radeon/radeon.h|   4 ++
 drivers/gpu/drm/radeon/radeon_cs.c |  25 +++-
 drivers/gpu/drm/radeon/radeon_drv.c|   5 +-
 drivers/gpu/drm/radeon/radeon_gem.c|  67 +++
 drivers/gpu/drm/radeon/radeon_kms.c|   1 +
 drivers/gpu/drm/radeon/radeon_object.c |   3 +
 drivers/gpu/drm/radeon/radeon_prime.c  |  10 +++
 drivers/gpu/drm/radeon/radeon_ttm.c| 113 -
 drivers/gpu/drm/radeon/radeon_vm.c |   3 +
 include/uapi/drm/radeon_drm.h  |  11 
 10 files changed, 238 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 8a190ce..ee55b01 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2111,6 +2111,8 @@ int radeon_gem_info_ioctl(struct drm_device *dev, void *data,
  struct drm_file *filp);
 int radeon_gem_create_ioctl(struct drm_device *dev, void *data,
struct drm_file *filp);
+int radeon_gem_import_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *filp);
 int radeon_gem_pin_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int radeon_gem_unpin_ioctl(struct drm_device *dev, void *data,
@@ -2835,6 +2837,8 @@ extern void radeon_legacy_set_clock_gating(struct radeon_device *rdev, int enabl
 extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int enable);
 extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain);
 extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
+extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t userptr);
+extern bool radeon_ttm_tt_has_userptr(struct ttm_tt *ttm);
 extern void radeon_vram_location(struct radeon_device *rdev, struct radeon_mc *mc, u64 base);
 extern void radeon_gtt_location(struct radeon_device *rdev, struct radeon_mc *mc);
 extern int radeon_resume_kms(struct drm_device *dev, bool resume, bool fbcon);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 71a1434..be65311 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -78,7 +78,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
struct radeon_cs_chunk *chunk;
struct radeon_cs_buckets buckets;
unsigned i, j;
-   bool duplicate;
+   bool duplicate, need_mmap_lock = false;
+   int r;

if (p->chunk_relocs_idx == -1) {
return 0;
@@ -164,6 +165,19 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
p->relocs[i].allowed_domains = domain;
}

+   if (radeon_ttm_tt_has_userptr(p->relocs[i].robj->tbo.ttm)) {
+   uint32_t domain = p->relocs[i].prefered_domains;
+   if (!(domain & RADEON_GEM_DOMAIN_GTT)) {
+   DRM_ERROR("Only RADEON_GEM_DOMAIN_GTT is "
+ "allowed for userptr BOs\n");
+   return -EINVAL;
+   }
+   need_mmap_lock = true;
+   domain = RADEON_GEM_DOMAIN_GTT;
+   p->relocs[i].prefered_domains = domain;
+   p->relocs[i].allowed_domains = domain;
+   }
+
p->relocs[i].tv.bo = &p->relocs[i].robj->tbo;
p->relocs[i].handle = r->handle;

@@ -176,8 +190,15 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
if (p->cs_flags & RADEON_CS_USE_VM)
p->vm_bos = radeon_vm_get_bos(p->rdev, p->ib.vm,
  &p->validated);
+   if 

[PATCH 1/2] drm/radeon: add readonly flag to radeon_gart_set_page v3

2014-07-11 Thread Christian König
From: Christian König 

v2: use flag instead of boolean
v3: keep R600_PTE_GART as it is

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/r100.c|  2 +-
 drivers/gpu/drm/radeon/r300.c|  8 ++--
 drivers/gpu/drm/radeon/radeon.h  | 10 ++
 drivers/gpu/drm/radeon/radeon_asic.h |  8 
 drivers/gpu/drm/radeon/radeon_gart.c |  9 +
 drivers/gpu/drm/radeon/radeon_ttm.c  |  4 ++--
 drivers/gpu/drm/radeon/rs400.c   |  9 +++--
 drivers/gpu/drm/radeon/rs600.c   |  8 ++--
 8 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index e32abf3..f58b5d1 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -650,7 +650,7 @@ void r100_pci_gart_disable(struct radeon_device *rdev)
 }

 void r100_pci_gart_set_page(struct radeon_device *rdev, unsigned i,
-   uint64_t addr)
+   uint64_t addr, uint32_t flags)
 {
u32 *gtt = rdev->gart.ptr;
gtt[i] = cpu_to_le32(lower_32_bits(addr));
diff --git a/drivers/gpu/drm/radeon/r300.c b/drivers/gpu/drm/radeon/r300.c
index 8d14e66..b947f42 100644
--- a/drivers/gpu/drm/radeon/r300.c
+++ b/drivers/gpu/drm/radeon/r300.c
@@ -73,13 +73,17 @@ void rv370_pcie_gart_tlb_flush(struct radeon_device *rdev)
 #define R300_PTE_READABLE  (1 << 3)

 void rv370_pcie_gart_set_page(struct radeon_device *rdev, unsigned i,
- uint64_t addr)
+ uint64_t addr, uint32_t flags)
 {
void __iomem *ptr = rdev->gart.ptr;

addr = (lower_32_bits(addr) >> 8) |
   ((upper_32_bits(addr) & 0xff) << 24) |
-  R300_PTE_WRITEABLE | R300_PTE_READABLE;
+  R300_PTE_READABLE;
+
+   if (!(flags & RADEON_GART_PAGE_READONLY))
+   addr |= R300_PTE_WRITEABLE;
+
/* on x86 we want this to be CPU endian, on powerpc
 * on powerpc without HW swappers, it'll get swapped on way
 * into VRAM - so no need for cpu_to_le32 on VRAM tables */
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 7cda75d..8a190ce 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -592,6 +592,8 @@ struct radeon_mc;
 #define RADEON_GPU_PAGE_SHIFT 12
 #define RADEON_GPU_PAGE_ALIGN(a) (((a) + RADEON_GPU_PAGE_MASK) & ~RADEON_GPU_PAGE_MASK)

+#define RADEON_GART_PAGE_READONLY  1
+
 struct radeon_gart {
dma_addr_t  table_addr;
struct radeon_bo*robj;
@@ -616,7 +618,7 @@ void radeon_gart_unbind(struct radeon_device *rdev, unsigned offset,
int pages);
 int radeon_gart_bind(struct radeon_device *rdev, unsigned offset,
 int pages, struct page **pagelist,
-dma_addr_t *dma_addr);
+dma_addr_t *dma_addr, uint32_t flags);


 /*
@@ -855,7 +857,7 @@ struct radeon_mec {

 /* flags used for GART page table entries on R600+ */
 #define R600_PTE_GART  ( R600_PTE_VALID | R600_PTE_SYSTEM | R600_PTE_SNOOPED \
-   | R600_PTE_READABLE | R600_PTE_WRITEABLE)
+   | R600_PTE_READABLE | R600_PTE_WRITEABLE )

 struct radeon_vm_pt {
struct radeon_bo*bo;
@@ -1775,7 +1777,7 @@ struct radeon_asic {
struct {
void (*tlb_flush)(struct radeon_device *rdev);
void (*set_page)(struct radeon_device *rdev, unsigned i,
-uint64_t addr);
+uint64_t addr, uint32_t flags);
} gart;
struct {
int (*init)(struct radeon_device *rdev);
@@ -2735,7 +2737,7 @@ void radeon_ring_write(struct radeon_ring *ring, uint32_t v);
 #define radeon_vga_set_state(rdev, state) (rdev)->asic->vga_set_state((rdev), (state))
 #define radeon_asic_reset(rdev) (rdev)->asic->asic_reset((rdev))
 #define radeon_gart_tlb_flush(rdev) (rdev)->asic->gart.tlb_flush((rdev))
-#define radeon_gart_set_page(rdev, i, p) (rdev)->asic->gart.set_page((rdev), (i), (p))
+#define radeon_gart_set_page(rdev, i, p, r) (rdev)->asic->gart.set_page((rdev), (i), (p), (r))
 #define radeon_asic_vm_init(rdev) (rdev)->asic->vm.init((rdev))
 #define radeon_asic_vm_fini(rdev) (rdev)->asic->vm.fini((rdev))
 #define radeon_asic_vm_set_page(rdev, ib, pe, addr, count, incr, flags) ((rdev)->asic->vm.set_page((rdev), (ib), (pe), (addr), (count), (incr), (flags)))
diff --git a/drivers/gpu/drm/radeon/radeon_asic.h b/drivers/gpu/drm/radeon/radeon_asic.h
index 7531b5e..f7d7c33 100644
--- a/drivers/gpu/drm/radeon/radeon_asic.h
+++ b/drivers/gpu/drm/radeon/radeon_asic.h
@@ -68,7 +68,7 @@ int r100_asic_reset(struct radeon_device *rdev);
 u32 r100_get_vblank_counter(struct radeon_device *rdev, int crtc);
 void r100_pci_gart_tlb_flush(struct radeon_device *rdev);
 void 
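
The rv370 hunk above builds a 32-bit page table entry and now sets the write bit
only when the caller did not ask for a read-only page. A standalone sketch of
that packing follows, so the effect of the flag can be checked in isolation. The
names are hypothetical; the readable bit (1 << 3) is taken from the quoted
context, while the writeable bit position (1 << 2) is an assumption not shown in
this excerpt.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PTE_WRITEABLE      (1u << 2)  /* assumption: not shown in the excerpt */
#define SKETCH_PTE_READABLE       (1u << 3)  /* matches R300_PTE_READABLE above */
#define SKETCH_GART_PAGE_READONLY 1u         /* matches RADEON_GART_PAGE_READONLY */

/*
 * Pack a page-aligned bus address (up to 40 bits) plus permission bits
 * into one 32-bit entry, following the shape of the quoted hunk.
 */
static uint32_t sketch_make_pte(uint64_t addr, uint32_t flags)
{
	uint32_t lo = (uint32_t)addr;          /* lower 32 address bits */
	uint32_t hi = (uint32_t)(addr >> 32);  /* upper 32 address bits */
	uint32_t pte;

	/* The low 12 bits must be clear or they would collide with flags. */
	assert((addr & 0xfff) == 0);

	pte = (lo >> 8) | ((hi & 0xff) << 24) | SKETCH_PTE_READABLE;

	/* Pages stay writeable unless the caller asked for read-only. */
	if (!(flags & SKETCH_GART_PAGE_READONLY))
		pte |= SKETCH_PTE_WRITEABLE;

	return pte;
}
```

Passing 0 for flags reproduces the old always-writeable behaviour, which is why
existing callers can be converted mechanically.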

[Xen-devel] [Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type

2014-07-11 Thread Konrad Rzeszutek Wilk
On Fri, Jul 11, 2014 at 08:29:56AM +0200, Daniel Vetter wrote:
> On Thu, Jul 10, 2014 at 09:08:24PM +, Tian, Kevin wrote:
> > actually I'm curious whether it's still necessary to __detect__ PCH. Could
> > we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard
> > code the knowledge:
> > 
> >   } else if (IS_BROADWELL(dev)) {
> >   dev_priv->pch_type = PCH_LPT;
> >   dev_priv->pch_id =
> >   INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
> >   DRM_DEBUG_KMS("This is Broadwell, assuming "
> > "LynxPoint LP PCH\n");
> > 
> > Or if there is real usage on non-fixed mapping (not majority), could it be a
> > better option to have fixed mapping as a fallback instead of leaving as
> > PCH_NONE? Then even when Qemu doesn't provide a special tweaked PCH,
> > the majority case just works.
> 
> I guess we can do it, at least I haven't seen any strange combinations in
> the wild outside of Intel ...

How big is the QA matrix for this? Would it make sense to just
include the latest hardware (say going two generations back)
and ignore the older one?


[PATCH RFC 06/15] drm/armada: move variant initialisation to CRTC init

2014-07-11 Thread Russell King - ARM Linux
On Sat, Jul 05, 2014 at 01:58:37PM +0200, Sebastian Hesselbarth wrote:
> On 07/05/2014 12:38 PM, Russell King wrote:
> > Move the variant initialisation entirely to the CRTC init function -
> > the variant support is really more about the CRTC properties than the whole
> > system, and we want to treat each CRTC individually when we support DT.
> > 
> > Signed-off-by: Russell King 
> > ---
> [...]
> > diff --git a/drivers/gpu/drm/armada/armada_crtc.h b/drivers/gpu/drm/armada/armada_crtc.h
> > index 531a9b0bdcfb..3f0e70bb2e9c 100644
> > --- a/drivers/gpu/drm/armada/armada_crtc.h
> > +++ b/drivers/gpu/drm/armada/armada_crtc.h
> > @@ -38,6 +38,7 @@ struct armada_crtc {
> > unsignednum;
> > void __iomem*base;
> > struct clk  *clk;
> > +   struct clk  *extclk[2];
> 
> Russell,
> 
> I wonder if we should rename the above array to srcclk instead of extclk
> while moving it anyway. That way we can use it for the other variant
> specific clocks, too.

As the patches are prepared with this change, I'd prefer to submit them
as-is, and then we can update that as and when the support for things
like the MMP/610 is finished off.  I think they're good to go, so I'll
send them off later today to David.

This leaves the TDA998x componentisation patches which I need to kick
out, and the initial DT changes.  Once those are in place, we should
have almost all ducks lined up for working DRM support - it'll certainly
be advanced enough to describe the LCD controllers and the TDA998x as
three separate DT entities using the of graph helpers.

What's left is the display-subsystem { } entity to describe the makeup
of the subsystem.  That's not included as we currently need to pass
a block of memory, and the DT support for reserving chunks of memory
appeared (last time I looked) to only be botch-merged (only half of it
seems to have been merged making the whole reserved memory thing
totally useless - why people only half-merge features I've no idea.)

-- 
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.


[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 06:56:12PM +, Bridgman, John wrote:
> >From: Jerome Glisse [mailto:j.glisse at gmail.com]
> >Sent: Friday, July 11, 2014 2:52 PM
> >To: Bridgman, John
> >Cc: Oded Gabbay; David Airlie; Deucher, Alexander; linux-
> >kernel at vger.kernel.org; dri-devel at lists.freedesktop.org; Lewycky, 
> >Andrew;
> >Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki; Kishon
> >Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas Pandruvada;
> >Santosh Shilimkar; Andreas Noever; Lucas Stach; Philipp Zabel
> >Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver for
> >AMD's GPUs
> >
> >On Fri, Jul 11, 2014 at 06:46:30PM +, Bridgman, John wrote:
> >> >From: Jerome Glisse [mailto:j.glisse at gmail.com]
> >> >Sent: Friday, July 11, 2014 2:11 PM
> >> >To: Bridgman, John
> >> >Cc: Oded Gabbay; David Airlie; Deucher, Alexander; linux-
> >> >kernel at vger.kernel.org; dri-devel at lists.freedesktop.org; Lewycky,
> >> >Andrew; Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J.
> >> >Wysocki; Kishon Vijay Abraham I; Sandeep Nair; Kenneth Heitke;
> >> >Srinivas Pandruvada; Santosh Shilimkar; Andreas Noever; Lucas Stach;
> >> >Philipp Zabel
> >> >Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver
> >> >for AMD's GPUs
> >> >
> >> >On Fri, Jul 11, 2014 at 06:02:39PM +, Bridgman, John wrote:
> >> >> >From: Jerome Glisse [mailto:j.glisse at gmail.com]
> >> >> >Sent: Friday, July 11, 2014 1:04 PM
> >> >> >To: Oded Gabbay
> >> >> >Cc: David Airlie; Deucher, Alexander;
> >> >> >linux-kernel at vger.kernel.org;
> >> >> >dri- devel at lists.freedesktop.org; Bridgman, John; Lewycky, Andrew;
> >> >> >Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki;
> >> >> >Kishon Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas
> >> >> >Pandruvada; Santosh Shilimkar; Andreas Noever; Lucas Stach;
> >> >> >Philipp Zabel
> >> >> >Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver
> >> >> >for AMD's GPUs
> >> >> >
> >> >> >On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
> >> >> >> This patch adds the code base of the hsa driver for AMD's GPUs.
> >> >> >>
> >> >> >> This driver is called kfd.
> >> >> >>
> >> >> >> This initial version supports the first HSA chip, Kaveri.
> >> >> >>
> >> >> >> This driver is located in a new directory structure under 
> >> >> >> drivers/gpu.
> >> >> >>
> >> >> >> Signed-off-by: Oded Gabbay 
> >> >> >
> >> >> >There are too many coding style issues. While we have been lax on
> >> >> >enforcing the scripts/checkpatch.pl rules, I think there is a limit
> >> >> >to that. I am not strict on the 80 chars per line, but other things
> >> >> >need fixing so we stay in line.
> >> >> >
> >> >> >Also, I am a bit worried about the license. Given the top comment in
> >> >> >each of the files, I am not sure this is GPL2 compatible. I would
> >> >> >need to ask a lawyer to review that.
> >> >> >
> >> >>
> >> >> Hi Jerome,
> >> >>
> >> >> Which line in the license are you concerned about? In theory we're
> >> >> using the same license as the initial code pushes for radeon, and I just
> >> >> did a side-by-side compare with the license header on cik.c in the
> >> >> radeon tree and confirmed that the two licenses are identical.
> >> >>
> >> >> The cik.c header has an additional "Authors:" line which the kfd
> >> >> files do not, but AFAIK that is not part of the license text proper.
> >> >>
> >> >
> >> >You can not claim GPL if you want to use this license. radeon is
> >> >weird best for historical reasons as we wanted to share code with BSD
> >> >thus it is dual licensed and this is reflected with :
> >> >MODULE_LICENSE("GPL and additional rights");
> >> >
> >> >inside radeon_drv.c
> >> >
> >> >So if you want to have MODULE_LICENSE(GPL) then you should have
> >> >header that use the GPL license wording and no wording from BSD like
> >license.
> >> >Otherwise change the MODULE_LICENSE and it would also be good to say
> >> >dual licensed at top of each files (or least next to each license) so
> >> >that it is clear this is BSD & GPL license.
> >>
> >> Got it. Missed that we had a different MODULE_LICENSE.
> >>
> >> Since the goal is license compatibility with radeon so we can update the
> >interface and move code between the drivers in future I guess my
> >preference would be to update MODULE_LICENSE in the kfd code to "GPL and
> >additional rights", do you think that would be OK ?
> >
> >I am not a lawyer and nothing that i said should be considered as legal 
> >advice
> >(on the contrary ;)) I think you need to be more clear with each license to
> >clear says GPLv2 or BSD ie dual licensed but the dual license is a beast you
> >would definitly want to talk to lawyer about.
> 
> Yeah, dual license seems horrid in its implications for developers so we've 
> always tried to avoid it. GPL hurts us for porting to other OSes so the X11 / 
> "GPL with additional rights" combo seemed like the ideal solution and 

[PATCH 13/83] hsa/radeon: Add 2 new IOCTL to kfd, CREATE_QUEUE and DESTROY_QUEUE

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:13AM +0300, Oded Gabbay wrote:
> This patch adds 2 new IOCTL to kfd driver.
> 
> The first IOCTL is KFD_IOC_CREATE_QUEUE that is used by the user-mode
> application to create a compute queue on the GPU.
> 
> The second IOCTL is KFD_IOC_DESTROY_QUEUE that is used by the
> user-mode application to destroy an existing compute queue on the GPU.
> 
> Signed-off-by: Oded Gabbay 

Coding style needs fixing. What is the percent argument? What is it used
for?

You need to range-check every argument provided by userspace. The rule is:
never trust userspace. This matters especially for things like queue_size,
which is used without ever being checked, so userspace can send 0, which
leads to a broken queue size.
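
A minimal sketch of the kind of check meant here, written as plain C so it
can be compiled outside the kernel. The struct, the limits and the
power-of-two requirement are assumptions made for illustration; they are
not taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits, chosen only for illustration. */
#define EXAMPLE_MIN_RING_SIZE 256u        /* bytes */
#define EXAMPLE_MAX_RING_SIZE (1u << 30)  /* bytes */

/* Hypothetical subset of the arguments copied in from userspace. */
struct example_create_queue_args {
	uint64_t ring_base_address;
	uint32_t ring_size;
};

static bool example_is_power_of_two(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Return 0 if the arguments look sane, -EINVAL otherwise. */
static int example_validate_queue_args(const struct example_create_queue_args *args)
{
	/* A NULL ring can never be valid. */
	if (args->ring_base_address == 0)
		return -EINVAL;

	/* Rejects 0 as well as absurdly large sizes. */
	if (args->ring_size < EXAMPLE_MIN_RING_SIZE ||
	    args->ring_size > EXAMPLE_MAX_RING_SIZE)
		return -EINVAL;

	/* Assumption: the hardware wants a power-of-two ring. */
	if (!example_is_power_of_two(args->ring_size))
		return -EINVAL;

	return 0;
}
```

The point is that the check runs right after copy_from_user() and before
anything is allocated or handed to the scheduler.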

Also, out of curiosity, what happens if userspace munmaps its ring buffer
before unregistering a queue?

> ---
>  drivers/gpu/hsa/radeon/kfd_chardev.c  | 155 ++
>  drivers/gpu/hsa/radeon/kfd_doorbell.c |  11 +++
>  include/uapi/linux/kfd_ioctl.h        |  69 +++

Again better to create an hsa directory for kfd_ioctl.h

>  3 files changed, 235 insertions(+)
>  create mode 100644 include/uapi/linux/kfd_ioctl.h
> 
> diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
> b/drivers/gpu/hsa/radeon/kfd_chardev.c
> index 0b5bc74..4e7d5d0 100644
> --- a/drivers/gpu/hsa/radeon/kfd_chardev.c
> +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
> @@ -27,11 +27,13 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "kfd_priv.h"
>  #include "kfd_scheduler.h"
>  
>  static long kfd_ioctl(struct file *, unsigned int, unsigned long);
>  static int kfd_open(struct inode *, struct file *);
> +static int kfd_mmap(struct file *, struct vm_area_struct *);
>  
>  static const char kfd_dev_name[] = "kfd";
>  
> @@ -108,17 +110,170 @@ kfd_open(struct inode *inode, struct file *filep)
>   return 0;
>  }
>  
> +static long
> +kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, void 
> __user *arg)
> +{
> + struct kfd_ioctl_create_queue_args args;
> + struct kfd_dev *dev;
> + int err = 0;
> + unsigned int queue_id;
> + struct kfd_queue *queue;
> + struct kfd_process_device *pdd;
> +
> + if (copy_from_user(&args, arg, sizeof(args)))
> + return -EFAULT;
> +
> + dev = radeon_kfd_device_by_id(args.gpu_id);
> + if (dev == NULL)
> + return -EINVAL;
> +
> + queue = kzalloc(
> + offsetof(struct kfd_queue, scheduler_queue) + 
> dev->device_info->scheduler_class->queue_size,
> + GFP_KERNEL);
> +
> + if (!queue)
> + return -ENOMEM;
> +
> + queue->dev = dev;
> +
> + mutex_lock(&p->mutex);
> +
> + pdd = radeon_kfd_bind_process_to_device(dev, p);
> + if (IS_ERR(pdd) < 0) {
> + err = PTR_ERR(pdd);
> + goto err_bind_pasid;
> + }
> +
> + pr_debug("kfd: creating queue number %d for PASID %d on GPU 0x%x\n",
> + pdd->queue_count,
> + p->pasid,
> + dev->id);
> +
> + if (pdd->queue_count++ == 0) {
> + err = dev->device_info->scheduler_class->register_process(dev->scheduler, p, &pdd->scheduler_process);
> + if (err < 0)
> + goto err_register_process;
> + }
> +
> + if (!radeon_kfd_allocate_queue_id(p, &queue_id))
> + goto err_allocate_queue_id;
> +
> + err = dev->device_info->scheduler_class->create_queue(dev->scheduler, pdd->scheduler_process,
> +   &queue->scheduler_queue,
> +   (void __user *)args.ring_base_address,
> +   args.ring_size,
> +   (void __user *)args.read_pointer_address,
> +   (void __user *)args.write_pointer_address,
> +   radeon_kfd_queue_id_to_doorbell(dev, p, queue_id));
> + if (err)
> + goto err_create_queue;
> +
> + radeon_kfd_install_queue(p, queue_id, queue);
> +
> + args.queue_id = queue_id;
> + args.doorbell_address = 
> (uint64_t)(uintptr_t)radeon_kfd_get_doorbell(filep, p, dev, queue_id);
> +
> + if (copy_to_user(arg, &args, sizeof(args))) {
> + err = -EFAULT;
> + goto err_copy_args_out;
> + }
> +
> + mutex_unlock(&p->mutex);
> +
> + pr_debug("kfd: queue id %d was created successfully.\n"
> +  " ring buffer address == 0x%016llX\n"
> +  " read ptr address== 0x%016llX\n"
> +  " write ptr address   == 0x%016llX\n"
> +  " doorbell address== 0x%016llX\n",
> + args.queue_id,
> + args.ring_base_address,
> + args.read_pointer_address,
> +   

[PATCH v3 4/4] ARM: tegra: roth: add display DT node

2014-07-11 Thread Thierry Reding
On Tue, Jul 08, 2014 at 09:32:14PM +0900, Alexandre Courbot wrote:
> Tegra DSI support has been fixed to support continuous clock behavior that
> the panel used on SHIELD requires, so finally add its device tree node
> since it is functional.
> 
> Signed-off-by: Alexandre Courbot 
> ---
>  arch/arm/boot/dts/tegra114-roth.dts | 22 +++---
>  1 file changed, 19 insertions(+), 3 deletions(-)

I've applied this to Tegra's for-3.17/dt branch. Thanks.

Thierry


[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 06:46:30PM +, Bridgman, John wrote:
> >From: Jerome Glisse [mailto:j.glisse at gmail.com]
> >Sent: Friday, July 11, 2014 2:11 PM
> >To: Bridgman, John
> >Cc: Oded Gabbay; David Airlie; Deucher, Alexander; linux-
> >kernel at vger.kernel.org; dri-devel at lists.freedesktop.org; Lewycky, 
> >Andrew;
> >Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki; Kishon
> >Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas Pandruvada;
> >Santosh Shilimkar; Andreas Noever; Lucas Stach; Philipp Zabel
> >Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver for
> >AMD's GPUs
> >
> >On Fri, Jul 11, 2014 at 06:02:39PM +, Bridgman, John wrote:
> >> >From: Jerome Glisse [mailto:j.glisse at gmail.com]
> >> >Sent: Friday, July 11, 2014 1:04 PM
> >> >To: Oded Gabbay
> >> >Cc: David Airlie; Deucher, Alexander; linux-kernel at vger.kernel.org;
> >> >dri- devel at lists.freedesktop.org; Bridgman, John; Lewycky, Andrew;
> >> >Joerg Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki;
> >> >Kishon Vijay Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas
> >> >Pandruvada; Santosh Shilimkar; Andreas Noever; Lucas Stach; Philipp
> >> >Zabel
> >> >Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver
> >> >for AMD's GPUs
> >> >
> >> >On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
> >> >> This patch adds the code base of the hsa driver for AMD's GPUs.
> >> >>
> >> >> This driver is called kfd.
> >> >>
> >> >> This initial version supports the first HSA chip, Kaveri.
> >> >>
> >> >> This driver is located in a new directory structure under drivers/gpu.
> >> >>
> >> >> Signed-off-by: Oded Gabbay 
> >> >
> >> >There is too coding style issues. While we have been lax on the
> >> >enforcing the scripts/checkpatch.pl rules i think there is a limit to
> >> >that. I am not strict on the 80chars per line but others things needs 
> >> >fixing
> >so we stay inline.
> >> >
> >> >Also i am a bit worried about the license, given top comment in each
> >> >of the files i am not sure this is GPL2 compatible. I would need to
> >> >ask lawyer to review that.
> >> >
> >>
> >> Hi Jerome,
> >>
> >> Which line in the license are you concerned about ? In theory we're using
> >the same license as the initial code pushes for radeon, and I just did a 
> >side-by
> >side compare with the license header on cik.c in the radeon tree and
> >confirmed that the two licenses are identical.
> >>
> >> The cik.c header has an additional "Authors:" line which the kfd files do
> >not, but AFAIK that is not part of the license text proper.
> >>
> >
> >You can not claim GPL if you want to use this license. radeon is weird best 
> >for
> >historical reasons as we wanted to share code with BSD thus it is dual
> >licensed and this is reflected with :
> >MODULE_LICENSE("GPL and additional rights");
> >
> >inside radeon_drv.c
> >
> >So if you want to have MODULE_LICENSE(GPL) then you should have header
> >that use the GPL license wording and no wording from BSD like license.
> >Otherwise change the MODULE_LICENSE and it would also be good to say
> >dual licensed at top of each files (or least next to each license) so that 
> >it is
> >clear this is BSD & GPL license.
> 
> Got it. Missed that we had a different MODULE_LICENSE.
> 
> Since the goal is license compatibility with radeon so we can update the 
> interface and move code between the drivers in future I guess my preference 
> would be to update MODULE_LICENSE in the kfd code to "GPL and additional 
> rights", do you think that would be OK ?

I am not a lawyer, and nothing I have said should be considered legal
advice (on the contrary ;)). I think each license header needs to say
clearly whether it is GPLv2 or BSD, i.e. dual licensed, but dual licensing
is a beast you would definitely want to talk to a lawyer about.

Cheers,
Jérôme


[Bug 79659] R9 270X lockup with unigine valley since radeonsi: enable ARB_sample_shading

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=79659

--- Comment #9 from Marek Olšák  ---
The hangs are gone if I apply my workaround which fixes the compile failures.



[PATCH 12/83] hsa/radeon: Add kfd mmap handler

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:12AM +0300, Oded Gabbay wrote:
> This patch adds the kfd mmap handler that maps the physical address
> of a doorbell page to a user-space virtual address. That virtual address
> belongs to the process that uses the doorbell page.
> 
> This mmap handler is called only from within the kernel and not to be
> called from user-mode mmap of /dev/kfd.

I think you need to make the maximum number of doorbells a function of the
page size. You definitely want to forbid any access to another process's
doorbells, and you can only map pages with PAGE_SIZE granularity, hence you
need to scale the maximum number of doorbells with the page size and not
assume the page size is 4k on x86. Someone might build a kernel with a
different page size, and using this driver on it would open several
security issues.
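
A rough sketch of the sizing described above, in plain C so it compiles
outside the kernel. All names are hypothetical and the numbers in the usage
note are made up; only the idea (round the per-process doorbell area up to
a whole number of pages, then derive the process limit from that) is what
is being suggested.

```c
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t example_round_up(size_t n, size_t align)
{
	return (n + align - 1) & ~(align - 1);
}

/*
 * Bytes of doorbell space reserved for one process. Rounding up to
 * page_size guarantees that no page is ever shared by two processes.
 */
static size_t example_doorbell_process_allocation(size_t doorbell_size,
						  size_t max_queues,
						  size_t page_size)
{
	return example_round_up(doorbell_size * max_queues, page_size);
}

/* Number of processes whose doorbells fit in the doorbell aperture. */
static size_t example_doorbell_process_limit(size_t aperture_size,
					     size_t doorbell_size,
					     size_t max_queues,
					     size_t page_size)
{
	size_t per_process = example_doorbell_process_allocation(doorbell_size,
								 max_queues,
								 page_size);

	return per_process ? aperture_size / per_process : 0;
}
```

With 4-byte doorbells, 1024 queues per process and an 8 MiB aperture, this
gives 2048 processes with 4k pages but only 128 with 64k pages, which is
why the limit cannot be hard-coded for one page size.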

Cheers,
Jérôme

> 
> Signed-off-by: Oded Gabbay 
> ---
>  drivers/gpu/hsa/radeon/kfd_chardev.c  | 20 +
>  drivers/gpu/hsa/radeon/kfd_doorbell.c | 85 +++
>  2 files changed, 105 insertions(+)
> 
> diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
> b/drivers/gpu/hsa/radeon/kfd_chardev.c
> index 7a56a8f..0b5bc74 100644
> --- a/drivers/gpu/hsa/radeon/kfd_chardev.c
> +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
> @@ -39,6 +39,7 @@ static const struct file_operations kfd_fops = {
>   .owner = THIS_MODULE,
>   .unlocked_ioctl = kfd_ioctl,
>   .open = kfd_open,
> + .mmap = kfd_mmap,
>  };
>  
>  static int kfd_char_dev_major = -1;
> @@ -131,3 +132,22 @@ kfd_ioctl(struct file *filep, unsigned int cmd, unsigned 
> long arg)
>  
>   return err;
>  }
> +
> +static int
> +kfd_mmap(struct file *filp, struct vm_area_struct *vma)
> +{
> + unsigned long pgoff = vma->vm_pgoff;
> + struct kfd_process *process;
> +
> + process = radeon_kfd_get_process(current);
> + if (IS_ERR(process))
> + return PTR_ERR(process);
> +
> + if (pgoff < KFD_MMAP_DOORBELL_START)
> + return -EINVAL;
> +
> + if (pgoff < KFD_MMAP_DOORBELL_END)
> + return radeon_kfd_doorbell_mmap(process, vma);
> +
> + return -EINVAL;
> +}
> diff --git a/drivers/gpu/hsa/radeon/kfd_doorbell.c 
> b/drivers/gpu/hsa/radeon/kfd_doorbell.c
> index 79a9d4b..e1d8506 100644
> --- a/drivers/gpu/hsa/radeon/kfd_doorbell.c
> +++ b/drivers/gpu/hsa/radeon/kfd_doorbell.c
> @@ -70,3 +70,88 @@ void radeon_kfd_doorbell_init(struct kfd_dev *kfd)
>   kfd->doorbell_process_limit = doorbell_process_limit;
>  }
>  
> +/* This is the /dev/kfd mmap (for doorbell) implementation. We intend that 
> this is only called through map_doorbells,
> +** not through user-mode mmap of /dev/kfd. */
> +int radeon_kfd_doorbell_mmap(struct kfd_process *process, struct 
> vm_area_struct *vma)
> +{
> + unsigned int device_index;
> + struct kfd_dev *dev;
> + phys_addr_t start;
> +
> + BUG_ON(vma->vm_pgoff < KFD_MMAP_DOORBELL_START || vma->vm_pgoff >= 
> KFD_MMAP_DOORBELL_END);
> +
> + /* For simplicitly we only allow mapping of the entire doorbell 
> allocation of a single device & process. */
> + if (vma->vm_end - vma->vm_start != doorbell_process_allocation())
> + return -EINVAL;
> +
> + /* device_index must be GPU ID!! */
> + device_index = vma->vm_pgoff - KFD_MMAP_DOORBELL_START;
> +
> + dev = radeon_kfd_device_by_id(device_index);
> + if (dev == NULL)
> + return -EINVAL;
> +
> + vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE | 
> VM_DONTDUMP | VM_PFNMAP;
> + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +
> + start = dev->doorbell_base + process->pasid * 
> doorbell_process_allocation();
> +
> + pr_debug("kfd: mapping doorbell page in radeon_kfd_doorbell_mmap\n"
> +  " target user address == 0x%016llX\n"
> +  " physical address== 0x%016llX\n"
> +  " vm_flags== 0x%08lX\n"
> +  " size== 0x%08lX\n",
> +  (long long unsigned int) vma->vm_start, start, vma->vm_flags,
> +  doorbell_process_allocation());
> +
> + return io_remap_pfn_range(vma,
> + vma->vm_start,
> + start >> PAGE_SHIFT,
> + doorbell_process_allocation(),
> + vma->vm_page_prot);
> +}
> +
> +/* Map the doorbells for a single process & device. This will indirectly 
> call radeon_kfd_doorbell_mmap.
> +** This assumes that the process mutex is being held. */
> +static int
> +map_doorbells(struct file *devkfd, struct kfd_process *process, struct 
> kfd_dev *dev)
> +{
> + struct kfd_process_device *pdd = 
> radeon_kfd_get_process_device_data(dev, process);
> +
> + if (pdd == NULL)
> + return -ENOMEM;
> +
> + if (pdd->doorbell_mapping == NULL) {
> + unsigned long offset = (KFD_MMAP_DOORBELL_START + dev->id) << 
> PAGE_SHIFT;
> + doorbell_t 

[PATCH 8/8] drm/tilcdc: panel: Add support for enable GPIO

2014-07-11 Thread Ezequiel Garcia
Hello Fabio,

On 11 Jul 12:08 PM, Fabio Estevam wrote:
> On Fri, Jul 11, 2014 at 11:18 AM, Ezequiel Garcia
>  wrote:
> > In order to support the "enable GPIO" available in many panel devices,
> > this commit adds a proper devicetree binding.
> >
> > By providing an enable GPIO in the devicetree, the driver can now turn
> > off and on the panel device, and/or the backlight device. Both the
> > backlight and the GPIO are optional properties.
> > +   panel_mod->enable_gpio = devm_gpiod_get(>dev, "enable");
> > +   if (IS_ERR(panel_mod->enable_gpio)) {
> > +   ret = PTR_ERR(panel_mod->enable_gpio);
> > +   if (ret != -ENOENT) {
> 
> Shouldn't this be controlled by a regulator instead? What if the panel
> is powered from a PMIC output?

I'm not sure I understand how that is related. I have a Newhaven LCD panel
(NHD-4.3-480272EF-ATXL#-T) and it has a signal called "Display On/Off" that
I'm using to enable and disable the panel from a GPIO.

This is useful when switching the output from the panel to the HDMI, for
instance, to turn off the display panel when the output goes to the HDMI.
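
To make the intent of the quoted hunk concrete, here is a minimal userspace
sketch of the optional-GPIO handling, assuming -ENOENT means the devicetree
has no enable line. The struct and function names are hypothetical and are
not the tilcdc driver's; the lookup result is passed in as a plain error
code instead of calling the real GPIO API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPIO descriptor. */
struct example_gpio {
	bool level;
};

struct example_panel {
	struct example_gpio *enable_gpio; /* NULL when the board has no enable line */
};

/*
 * lookup_err is what the GPIO lookup returned: 0 on success, -ENOENT when
 * the property is absent, any other negative errno on a real failure.
 */
static int example_panel_attach_gpio(struct example_panel *panel,
				     struct example_gpio *gpio, int lookup_err)
{
	if (lookup_err == -ENOENT) {
		/* The GPIO is optional: carry on without it. */
		panel->enable_gpio = NULL;
		return 0;
	}
	if (lookup_err)
		return lookup_err; /* a real error: propagate it */

	panel->enable_gpio = gpio;
	return 0;
}

/* Drive the "Display On/Off" line, if the board has one. */
static void example_panel_set_enabled(struct example_panel *panel, bool on)
{
	if (panel->enable_gpio)
		panel->enable_gpio->level = on;
}
```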

Probably I'm missing something, I can't really see how regulators fit here.

Thanks!
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com


[PATCH 11/83] hsa/radeon: Add scheduler code

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:11AM +0300, Oded Gabbay wrote:
> This patch adds the code base of the scheduler, which handles queue
> creation, deletion and scheduling on the CP of the GPU.
> 
> Signed-off-by: Oded Gabbay 

I would rather see all of this squashed. As posted, it gives the impression
that the driver can access registers, which is later removed. I know
juggling patch squashing can be daunting, but it really makes reviewing
hard here, because I have to jump back and forth to see whether the thing
I am looking at still matters in the final version.

Cheers,
Jérôme

> ---
>  drivers/gpu/hsa/radeon/Makefile   |   3 +-
>  drivers/gpu/hsa/radeon/cik_regs.h | 213 +++
>  drivers/gpu/hsa/radeon/kfd_device.c   |   1 +
>  drivers/gpu/hsa/radeon/kfd_registers.c|  50 ++
>  drivers/gpu/hsa/radeon/kfd_sched_cik_static.c | 800 ++
>  drivers/gpu/hsa/radeon/kfd_vidmem.c   |  61 ++
>  6 files changed, 1127 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/hsa/radeon/cik_regs.h
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_registers.c
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_sched_cik_static.c
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_vidmem.c
> 
> diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
> index 989518a..28da10c 100644
> --- a/drivers/gpu/hsa/radeon/Makefile
> +++ b/drivers/gpu/hsa/radeon/Makefile
> @@ -4,6 +4,7 @@
>  
>  radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \
>   kfd_pasid.o kfd_topology.o kfd_process.o \
> - kfd_doorbell.o
> + kfd_doorbell.o kfd_sched_cik_static.o kfd_registers.o \
> + kfd_vidmem.o
>  
>  obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o
> diff --git a/drivers/gpu/hsa/radeon/cik_regs.h 
> b/drivers/gpu/hsa/radeon/cik_regs.h
> new file mode 100644
> index 000..d0cdc57
> --- /dev/null
> +++ b/drivers/gpu/hsa/radeon/cik_regs.h
> @@ -0,0 +1,213 @@
> +/*
> + * Copyright 2014 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + */
> +
> +#ifndef CIK_REGS_H
> +#define CIK_REGS_H
> +
> +#define BIF_DOORBELL_CNTL0x530Cu
> +
> +#define  SRBM_GFX_CNTL   0xE44
> +#define  PIPEID(x)   ((x) << 0)
> +#define  MEID(x) ((x) << 2)
> +#define  VMID(x) ((x) << 4)
> +#define  QUEUEID(x)  ((x) << 8)
> +
> +#define  SQ_CONFIG   0x8C00
> +
> +#define  SH_MEM_BASES0x8C28
> +/* if PTR32, these are the bases for scratch and lds */
> +#define  PRIVATE_BASE(x) ((x) << 0) /* 
> scratch */
> +#define  SHARED_BASE(x)  ((x) << 16) /* 
> LDS */
> +#define  SH_MEM_APE1_BASE0x8C2C
> +/* if PTR32, this is the base location of GPUVM */
> +#define  SH_MEM_APE1_LIMIT   0x8C30
> +/* if PTR32, this is the upper limit of GPUVM */
> +#define  SH_MEM_CONFIG   0x8C34
> +#define  PTR32   (1 << 0)
> +#define  ALIGNMENT_MODE(x)   ((x) << 2)
> +#define  SH_MEM_ALIGNMENT_MODE_DWORD 0
> +#define  SH_MEM_ALIGNMENT_MODE_DWORD_STRICT  1
> +#define  SH_MEM_ALIGNMENT_MODE_STRICT2
> +#define  SH_MEM_ALIGNMENT_MODE_UNALIGNED 3
> +#define  DEFAULT_MTYPE(x)((x) << 4)
> +#define  APE1_MTYPE(x)   ((x) << 7)
> +
> +/* valid for both DEFAULT_MTYPE and APE1_MTYPE */
> +#define 
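
As an illustration of how the SRBM_GFX_CNTL field macros quoted above
compose, here is a minimal sketch. The shift amounts are copied from the
quoted header; the field widths used for masking are an assumption inferred
from the gaps between the shifts, and example_srbm_gfx_cntl() is a
hypothetical helper, not a function from the patch.

```c
#include <stdint.h>

/* Field macros as they appear in the quoted cik_regs.h. */
#define PIPEID(x)   ((x) << 0)
#define MEID(x)     ((x) << 2)
#define VMID(x)     ((x) << 4)
#define QUEUEID(x)  ((x) << 8)

/*
 * Build a value for SRBM_GFX_CNTL selecting one pipe/ME/VMID/queue.
 * Masks assume 2-bit pipe, 2-bit ME, 4-bit VMID and 3-bit queue fields.
 */
static uint32_t example_srbm_gfx_cntl(uint32_t pipe, uint32_t me,
				      uint32_t vmid, uint32_t queue)
{
	return PIPEID(pipe & 0x3) | MEID(me & 0x3) |
	       VMID(vmid & 0xF) | QUEUEID(queue & 0x7);
}
```

For pipe 1, ME 1, VMID 8 and queue 2 this yields 0x285.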

[RESEND PATCH v3 06/11] drm: add DT bindings documentation for atmel-hlcdc-dc driver

2014-07-11 Thread Boris BREZILLON
On Fri, 11 Jul 2014 14:00:25 +0200
Boris BREZILLON  wrote:

> On Fri, 11 Jul 2014 12:37:46 +0200
> Laurent Pinchart  wrote:
> 
> > Hi Boris,
> > 
> > On Thursday 10 July 2014 14:56:26 Boris BREZILLON wrote:
> > > On Thu, 10 Jul 2014 13:16:21 +0200 Laurent Pinchart wrote:
> > > > On Monday 07 July 2014 18:42:59 Boris BREZILLON wrote:
> > > > > The Atmel HLCDC (HLCD Controller) IP available on some Atmel SoCs 
> > > > > (i.e.
> > > > > at91sam9n12, at91sam9x5 family or sama5d3 family) provides a display
> > > > > controller device.
> > > > > 
> > > > > The HLCDC block provides a single RGB output port, and only supports 
> > > > > LCD
> > > > > panels connection to LCD panels for now.
> > > > > 
> > > > > The atmel,panel property link the HLCDC RGB output with the LCD panel
> > > > > connected on this port (note that the HLCDC RGB connector 
> > > > > implementation
> > > > > makes use of the DRM panel framework).
> > > > > 
> > > > > Connection to other external devices (DRM bridges) might be added 
> > > > > later
> > > > > by mean of a new atmel,xxx (atmel,bridge) property.
> > > > > 
> > > > > Signed-off-by: Boris BREZILLON 
> > > > > ---
> > > > > 
> > > > >  .../devicetree/bindings/drm/atmel-hlcdc-dc.txt | 59 
> > > > > +++
> > > > >  1 file changed, 59 insertions(+)
> > > > >  create mode 100644
> > > > >  Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt
> > > > > 
> > > > > diff --git a/Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt
> > > > > b/Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt new file 
> > > > > mode
> > > > > 100644
> > > > > index 000..594bdb2
> > > > > --- /dev/null
> > > > > +++ b/Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt
> > > > > @@ -0,0 +1,59 @@
> > > > > +Device-Tree bindings for Atmel's HLCDC (High LCD Controller) DRM 
> > > > > driver
> > > > > +
> > > > > +The Atmel HLCDC Display Controller is subdevice of the HLCDC MFD
> > > > > device.
> > > > > +See Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt for more
> > > > > details.
> > > > > +
> > > > > +Required properties:
> > > > > + - compatible: value should be one of the following:
> > > > > +   "atmel,hlcdc-dc"
> > > > > + - interrupts: the HLCDC interrupt definition
> > > > > + - pinctrl-names: the pin control state names. Should contain
> > > > > "default",
> > > > > +   "rgb-444", "rgb-565", "rgb-666" and "rgb-888".
> > > > > + - pinctrl-[0-4]: should contain the pinctrl states described by
> > > > > pinctrl
> > > > > +   names.
> > > > 
> > > > Do you need to switch between the different pinctrl configurations at
> > > > runtime, or is the configuration selected from the panel type, which
> > > > doesn't change ?
> > >
> > > At the moment no, but if we ever need to support different devices on
> > > the same RGB connector (actually Atmel's sama5d3xek boards have an
> > > RGB to HDMI bridge connected on the same RGB connector) and these
> > > devices do not support the same RGB mode (say your LCD panel supports
> > > RGB888 and your RGB to HDMI bridge supports RGB555), then depending on
> > > the output you select you'll have to change your pinctrl config at
> > > runtime.
> > 
> > Just to make sure I understand the use case correctly, are you talking 
> > about 
> > two devices (for example an RGB666 panel and an RGB888 RGB to HDMI bridge) 
> > connected to the same output, with the ability to switch between the two at 
> > runtime ?
> 
> Exactly.
> 
> > That's a valid case (on a side note we shouldn't forget that the 
> > option of using both devices at the same time should be supported as well), 
> 
> AFAICT this is only possible if both devices connected to the RGB
> connector use the same mode.
> 
> > but I would probably go for a fixed pinctrl configuration that supports 
> > both, 
> > although switching configurations at runtime would be a micro-optimization 
> > that might make sense.
> 
> Yep, it should work, and I agree that we're unlikely to reuse some RGB
> pins for other usage when the active device is the one using RGB666
> mode.
> 
> > 
> > > I'd say we could get rid of this runtime pinctrl config as a first step
> > > if DT ABI stability was not required.
> > > But it is, and I'd like to have a future proof binding to handle these
> > > tricky cases when they occurs (if they ever do).
> > 
> > I think we have a shortcoming of the pinctrl API here in the general case. 
> > The 
> > API only allows you to select a single configuration per device. Imagine 
> > the 
> > same display controller, with two DPI outputs, each of them configurable in 
> > 444, 565, 666 or 888 modes. With the current API we would have to create 
> > 4*4 = 
> > 16 pinctrl configurations for all combinations. That obviously wouldn't 
> > scale, 
> > so we'll have to fix this eventually. From a DT stability point of view, I 
> > would thus avoid specifying multiple pinctrl configurations now until we 
> > come 
> > up with a standard way to support this use case.
> 

[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 06:02:39PM +, Bridgman, John wrote:
> >From: Jerome Glisse [mailto:j.glisse at gmail.com]
> >Sent: Friday, July 11, 2014 1:04 PM
> >To: Oded Gabbay
> >Cc: David Airlie; Deucher, Alexander; linux-kernel at vger.kernel.org; dri-
> >devel at lists.freedesktop.org; Bridgman, John; Lewycky, Andrew; Joerg
> >Roedel; Gabbay, Oded; Greg Kroah-Hartman; Rafael J. Wysocki; Kishon Vijay
> >Abraham I; Sandeep Nair; Kenneth Heitke; Srinivas Pandruvada; Santosh
> >Shilimkar; Andreas Noever; Lucas Stach; Philipp Zabel
> >Subject: Re: [PATCH 09/83] hsa/radeon: Add code base of hsa driver for
> >AMD's GPUs
> >
> >On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
> >> This patch adds the code base of the hsa driver for
> >> AMD's GPUs.
> >>
> >> This driver is called kfd.
> >>
> >> This initial version supports the first HSA chip, Kaveri.
> >>
> >> This driver is located in a new directory structure under drivers/gpu.
> >>
> >> Signed-off-by: Oded Gabbay 
> >
> >There is too coding style issues. While we have been lax on the enforcing the
> >scripts/checkpatch.pl rules i think there is a limit to that. I am not strict
> >on the 80chars per line but others things needs fixing so we stay inline.
> >
> >Also i am a bit worried about the license, given top comment in each of the
> >files i am not sure this is GPL2 compatible. I would need to ask lawyer to
> >review that.
> >
> 
> Hi Jerome,
> 
> Which line in the license are you concerned about ? In theory we're using the 
> same license as the initial code pushes for radeon, and I just did a side-by 
> side compare with the license header on cik.c in the radeon tree and 
> confirmed that the two licenses are identical. 
> 
> The cik.c header has an additional "Authors:" line which the kfd files do 
> not, but AFAIK that is not part of the license text proper.
> 

You cannot claim GPL if you want to use this license. radeon is a weird
beast for historical reasons: we wanted to share code with BSD, so it is
dual licensed, and this is reflected by:
MODULE_LICENSE("GPL and additional rights");

inside radeon_drv.c

So if you want to have MODULE_LICENSE("GPL"), then the header should use
the GPL license wording and no wording from a BSD-like license.
Otherwise change the MODULE_LICENSE; it would also be good to say
"dual licensed" at the top of each file (or at least next to each
license) so that it is clear this is a BSD & GPL license.

Cheers,
Jérôme


[RESEND PATCH v3 06/11] drm: add DT bindings documentation for atmel-hlcdc-dc driver

2014-07-11 Thread Boris BREZILLON
On Fri, 11 Jul 2014 12:37:46 +0200
Laurent Pinchart  wrote:

> Hi Boris,
> 
> On Thursday 10 July 2014 14:56:26 Boris BREZILLON wrote:
> > On Thu, 10 Jul 2014 13:16:21 +0200 Laurent Pinchart wrote:
> > > On Monday 07 July 2014 18:42:59 Boris BREZILLON wrote:
> > > > The Atmel HLCDC (HLCD Controller) IP available on some Atmel SoCs (i.e.
> > > > at91sam9n12, at91sam9x5 family or sama5d3 family) provides a display
> > > > controller device.
> > > > 
> > > > The HLCDC block provides a single RGB output port, and only supports LCD
> > > > panels connection to LCD panels for now.
> > > > 
> > > > The atmel,panel property link the HLCDC RGB output with the LCD panel
> > > > connected on this port (note that the HLCDC RGB connector implementation
> > > > makes use of the DRM panel framework).
> > > > 
> > > > Connection to other external devices (DRM bridges) might be added later
> > > > by mean of a new atmel,xxx (atmel,bridge) property.
> > > > 
> > > > Signed-off-by: Boris BREZILLON 
> > > > ---
> > > > 
> > > >  .../devicetree/bindings/drm/atmel-hlcdc-dc.txt | 59 +++
> > > >  1 file changed, 59 insertions(+)
> > > >  create mode 100644
> > > >  Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt
> > > > 
> > > > diff --git a/Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt
> > > > b/Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt new file mode
> > > > 100644
> > > > index 000..594bdb2
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt
> > > > @@ -0,0 +1,59 @@
> > > > +Device-Tree bindings for Atmel's HLCDC (High LCD Controller) DRM driver
> > > > +
> > > > +The Atmel HLCDC Display Controller is subdevice of the HLCDC MFD
> > > > device.
> > > > +See Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt for more
> > > > details.
> > > > +
> > > > +Required properties:
> > > > + - compatible: value should be one of the following:
> > > > +   "atmel,hlcdc-dc"
> > > > + - interrupts: the HLCDC interrupt definition
> > > > + - pinctrl-names: the pin control state names. Should contain
> > > > "default",
> > > > +   "rgb-444", "rgb-565", "rgb-666" and "rgb-888".
> > > > + - pinctrl-[0-4]: should contain the pinctrl states described by
> > > > pinctrl
> > > > +   names.
> > > 
> > > Do you need to switch between the different pinctrl configurations at
> > > runtime, or is the configuration selected from the panel type, which
> > > doesn't change ?
> >
> > At the moment no, but if we ever need to support different devices on
> > the same RGB connector (actually Atmel's sama5d3xek boards have an
> > RGB to HDMI bridge connected on the same RGB connector) and these
> > devices do not support the same RGB mode (say your LCD panel supports
> > RGB888 and your RGB to HDMI bridge supports RGB555), then depending on
> > the output you select you'll have to change your pinctrl config at
> > runtime.
> 
> Just to make sure I understand the use case correctly, are you talking about 
> two devices (for example an RGB666 panel and an RGB888 RGB to HDMI bridge) 
> connected to the same output, with the ability to switch between the two at 
> runtime ?

Exactly.

> That's a valid case (on a side note we shouldn't forget that the 
> option of using both devices at the same time should be supported as well), 

AFAICT this is only possible if both devices connected to the RGB
connector use the same mode.

> but I would probably go for a fixed pinctrl configuration that supports both, 
> although switching configurations at runtime would be a micro-optimization 
> that might make sense.

Yep, it should work, and I agree that we're unlikely to reuse some RGB
pins for other usage when the active device is the one using RGB666
mode.

> 
> > I'd say we could get rid of this runtime pinctrl config as a first step
> > if DT ABI stability was not required.
> > But it is, and I'd like to have a future proof binding to handle these
> > tricky cases when they occurs (if they ever do).
> 
> I think we have a shortcoming of the pinctrl API here in the general case.
> The API only allows you to select a single configuration per device. Imagine
> the same display controller, with two DPI outputs, each of them configurable
> in 444, 565, 666 or 888 modes. With the current API we would have to create
> 4*4 = 16 pinctrl configurations for all combinations. That obviously wouldn't
> scale, so we'll have to fix this eventually. From a DT stability point of
> view, I would thus avoid specifying multiple pinctrl configurations now until
> we come up with a standard way to support this use case.

Given your inputs, I guess I'll drop dynamic pinctrl config for the
next version.

> 
> > Anyway, I'm open to any other alternative that could let me add support
> > for this later on.
> > 
> > BTW, is there any reason for not defining an RGB connector type (I'm
> > currently defining HLCDC connector as an LVDS connector) ?
> 
> Not that I know of. The DRM API has 

[PATCH 0/3] drm/gk20a: support for reclocking

2014-07-11 Thread Peter De Schrijver
On Fri, Jul 11, 2014 at 04:01:02AM +0200, Ben Skeggs wrote:
> On Fri, Jul 11, 2014 at 11:49 AM, Alexandre Courbot  
> wrote:
> > On 07/10/2014 06:43 PM, Peter De Schrijver wrote:
> >>
> >> On Thu, Jul 10, 2014 at 09:34:34AM +0200, Alexandre Courbot wrote:
> >>>
> >>> This series adds support for reclocking on GK20A. The first two patches
> >>> touch the clock subsystem to allow GK20A to operate, by making the
> >>> presence of the thermal and voltage devices optional, and allowing
> >>> pstates to be provided directly instead of being probed using the BIOS
> >>> (which Tegra does not have).
> >>>
> >>> The last patch adds the GK20A clock device. Arguably the clock can be
> >>> seen as a stripped-down version of what is seen on NVE0, however instead
> >>> of using NVE0, support has been written from scratch using the ChromeOS
> >>> kernel as a basis. There are several reasons for this:
> >>>
> >>> - The ChromeOS driver uses a lookup table for the P coefficient which I
> >>>   could not find in the NVE0 driver,
> >>> - Some registers that NVE0 expects to find are not present on GK20A (e.g.
> >>>   0x137120 and 0x137140),
> >>> - Calculation of MNP is done differently from what is performed in
> >>>   nva3_pll_calc(), and it might be interesting to compare the two
> >>>   methods,
> >>> - All the same, the programming sequence is done differently in the
> >>>   ChromeOS driver and NVE0 could possibly benefit from it (?)
> >>>
> >>> It would be interesting to try and merge both, but for now I prefer to
> >>> have the two coexisting to ensure proper operation on GK20A and be sure
> >>> I don't break dGPU support. :)
> >>>
> >>> Regarding the first patch, one might argue that I could as well add
> >>> thermal and voltage devices to GK20A. The reason this is not done is
> >>> because these currently depend heavily on the presence of a BIOS, and
> >>> will require a rework similar to that done in patch 2 for clocks. I would
> >>> like to make sure this approach is approved before applying it to other
> >>> subdevs.
> >>
> >>
> >> I think this should use CCF so we can use pre and post rate change
> >> notifiers to hook up vdd_gpu DVS.
> >
> >
> > Do you mean that we should turn the Nouveau gk20a clock driver into a
> > consumer of this CCF clock? I have nothing against this, but note that
> > Nouveau can also perform DVS on its own, as the pstates can also contain a
> > voltage to be applied to the volt device (not yet implemented in this
> > series).
> >
> > The question then becomes whether we want an additional layer of abstraction
> > on these devices and whether the pre/post rate change notifiers give us any
> > advantage compared to what Nouveau currently proposes.
> I had a brief look at this, and personally I don't think the CCF is a
> very good match at all for how we're *supposed* to manage clock
> frequencies as described by a discrete GPU VBIOS, and especially for
> when we get to the point of using the PMU falcon to coordinate all the
> various bits and pieces that go towards power management.
> 

For all I can see, the PMU is not involved in the mechanics of GPU frequency
scaling on Tegra.

Cheers,

Peter.


[PATCH 0/3] drm/gk20a: support for reclocking

2014-07-11 Thread Peter De Schrijver
On Fri, Jul 11, 2014 at 03:49:06AM +0200, Alex Courbot wrote:
> On 07/10/2014 06:43 PM, Peter De Schrijver wrote:
> > On Thu, Jul 10, 2014 at 09:34:34AM +0200, Alexandre Courbot wrote:
> >> This series adds support for reclocking on GK20A. The first two patches
> >> touch the clock subsystem to allow GK20A to operate, by making the
> >> presence of the thermal and voltage devices optional, and allowing
> >> pstates to be provided directly instead of being probed using the BIOS
> >> (which Tegra does not have).
> >>
> >> The last patch adds the GK20A clock device. Arguably the clock can be
> >> seen as a stripped-down version of what is seen on NVE0, however instead
> >> of using NVE0, support has been written from scratch using the ChromeOS
> >> kernel as a basis. There are several reasons for this:
> >>
> >> - The ChromeOS driver uses a lookup table for the P coefficient which I
> >>   could not find in the NVE0 driver,
> >> - Some registers that NVE0 expects to find are not present on GK20A (e.g.
> >>   0x137120 and 0x137140),
> >> - Calculation of MNP is done differently from what is performed in
> >>   nva3_pll_calc(), and it might be interesting to compare the two methods,
> >> - All the same, the programming sequence is done differently in the
> >>   ChromeOS driver and NVE0 could possibly benefit from it (?)
> >>
> >> It would be interesting to try and merge both, but for now I prefer to
> >> have the two coexisting to ensure proper operation on GK20A and be sure
> >> I don't break dGPU support. :)
> >>
> >> Regarding the first patch, one might argue that I could as well add
> >> thermal and voltage devices to GK20A. The reason this is not done is
> >> because these currently depend heavily on the presence of a BIOS, and
> >> will require a rework similar to that done in patch 2 for clocks. I would
> >> like to make sure this approach is approved before applying it to other
> >> subdevs.
> >
> > I think this should use CCF so we can use pre and post rate change notifiers
> > to hookup vdd_gpu DVS.
> 
> Do you mean that we should turn the Nouveau gk20a clock driver into a 
> consumer of this CCF clock? I have nothing against this, but note that 
> Nouveau can also perform DVS on its own, as the pstates can also contain 
> a voltage to be applied to the volt device (not yet implemented in this 
> series).
> 

Yes. For Tegra I think it makes sense to move DVS out of the individual
drivers. Then we can share the code which has to deal with building the OPP
tables with other DVS rails (eg. vdd_core) for example. Often there are also
chip specific quirks to be dealt with (such as the maximum allowed voltage step
or voltage relationships between rails), which are easier to handle in common
code.
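
The ordering that pre and post rate change notifiers make possible can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the Tegra or CCF code: the operating points, the struct
names and the max_step_uv quirk value are all hypothetical. The idea shown
is that the rail is raised before the clock speeds up ("pre") and lowered
only after it has slowed down ("post"), with each voltage move limited to a
maximum step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operating points: frequency in kHz, minimum voltage in uV. */
struct opp {
	unsigned long khz;
	unsigned long uv;
};

static const struct opp opp_table[] = {
	{  72000,  800000 },
	{ 396000,  900000 },
	{ 708000, 1000000 },
};

/* State of one rail and the clock it feeds. */
struct rail {
	unsigned long khz;
	unsigned long uv;
	unsigned long max_step_uv;	/* quirk: largest single voltage step */
	unsigned int steps;		/* voltage steps issued so far */
};

/* Lowest voltage that supports the given rate, or 0 if the rate is too high. */
static unsigned long uv_for_rate(unsigned long khz)
{
	size_t i;

	for (i = 0; i < sizeof(opp_table) / sizeof(opp_table[0]); i++)
		if (khz <= opp_table[i].khz)
			return opp_table[i].uv;
	return 0;
}

/* Move the rail to target_uv without exceeding max_step_uv per step. */
static void ramp_voltage(struct rail *r, unsigned long target_uv)
{
	assert(r->max_step_uv > 0);
	while (r->uv != target_uv) {
		unsigned long delta = target_uv > r->uv ? target_uv - r->uv
							: r->uv - target_uv;

		if (delta > r->max_step_uv)
			delta = r->max_step_uv;
		if (target_uv > r->uv)
			r->uv += delta;
		else
			r->uv -= delta;
		r->steps++;
	}
}

/* Returns 0 on success, -1 if no operating point supports new_khz. */
static int set_rate(struct rail *r, unsigned long new_khz)
{
	unsigned long need_uv = uv_for_rate(new_khz);

	if (need_uv == 0)
		return -1;
	if (need_uv > r->uv)		/* "pre" step: raise voltage first */
		ramp_voltage(r, need_uv);
	r->khz = new_khz;		/* the rate change itself */
	if (need_uv < r->uv)		/* "post" step: lower voltage afterwards */
		ramp_voltage(r, need_uv);
	return 0;
}
```

With this ordering the rail never sits below the voltage required by the
current rate, and the table lookup and step limit live in one place that
several rails could share.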

Cheers,

Peter.


[Bug 80868] Support screen scaling modes for external monitors

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=80868

--- Comment #15 from Alex Deucher  ---
I've added the patch to my 3.17 queue.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 80868] Support screen scaling modes for external monitors

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=80868

--- Comment #14 from Kamil Páral  ---
(In reply to comment #13)
> Created attachment 102565 [details] [review]
> better patch
> 
> This version handles the no modes case properly.

I re-tried it, still works for me.



[Bug 78096] Linux 3.14 doesn't boot with Radeon HD 5870

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=78096

--- Comment #8 from Alex Deucher  ---
Any progress on the bisecting?  Can you try Linus' git tree?  There were several
evergreen fixes that went into 3.16 that may help.



[Bug 78096] Linux 3.14 doesn't boot with Radeon HD 5870

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=78096

--- Comment #7 from stephane.raimbault at gmail.com ---
My previous comment was wrong: the bug isn't fixed at all with a recent kernel
and is still present.



[Bug 78096] Linux 3.14 doesn't boot with Radeon HD 5870

2014-07-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=78096

stephane.raimbault at gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #6 from stephane.raimbault at gmail.com ---
(In reply to comment #5)
> Fixed with Linux 3.15.



[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
> This patch adds the code base of the hsa driver for
> AMD's GPUs.
> 
> This driver is called kfd.
> 
> This initial version supports the first HSA chip, Kaveri.
> 
> This driver is located in a new directory structure under drivers/gpu.
> 
> Signed-off-by: Oded Gabbay 

There are too many coding style issues. While we have been lax on enforcing the
scripts/checkpatch.pl rules, I think there is a limit to that. I am not strict
on the 80 chars per line, but other things need fixing so we stay in line.

Also, I am a bit worried about the license: given the top comment in each of the
files, I am not sure this is GPL2 compatible. I would need to ask a lawyer to
review that.

Other comments inline.


> ---
>  drivers/Kconfig|2 +
>  drivers/gpu/Makefile   |1 +
>  drivers/gpu/hsa/Kconfig|   20 +
>  drivers/gpu/hsa/Makefile   |1 +
>  drivers/gpu/hsa/radeon/Makefile|8 +
>  drivers/gpu/hsa/radeon/kfd_chardev.c   |  133 
>  drivers/gpu/hsa/radeon/kfd_crat.h  |  292 
>  drivers/gpu/hsa/radeon/kfd_device.c|  162 +
>  drivers/gpu/hsa/radeon/kfd_module.c|  117 
>  drivers/gpu/hsa/radeon/kfd_pasid.c |   92 +++
>  drivers/gpu/hsa/radeon/kfd_priv.h  |  232 ++
>  drivers/gpu/hsa/radeon/kfd_process.c   |  400 +++
>  drivers/gpu/hsa/radeon/kfd_scheduler.h |   62 ++
>  drivers/gpu/hsa/radeon/kfd_topology.c  | 1201 
> 
>  drivers/gpu/hsa/radeon/kfd_topology.h  |  168 +
>  15 files changed, 2891 insertions(+)
>  create mode 100644 drivers/gpu/hsa/Kconfig
>  create mode 100644 drivers/gpu/hsa/Makefile
>  create mode 100644 drivers/gpu/hsa/radeon/Makefile
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_chardev.c
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_crat.h
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_device.c
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_module.c
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_pasid.c
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_priv.h
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_process.c
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_scheduler.h
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_topology.c
>  create mode 100644 drivers/gpu/hsa/radeon/kfd_topology.h
> 
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index 9b2dcc2..c1ac8f8 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -178,4 +178,6 @@ source "drivers/mcb/Kconfig"
>  
>  source "drivers/thunderbolt/Kconfig"
>  
> +source "drivers/gpu/hsa/Kconfig"
> +
>  endmenu
> diff --git a/drivers/gpu/Makefile b/drivers/gpu/Makefile
> index 70da9eb..749a7ea 100644
> --- a/drivers/gpu/Makefile
> +++ b/drivers/gpu/Makefile
> @@ -1,3 +1,4 @@
>  obj-y+= drm/ vga/
>  obj-$(CONFIG_TEGRA_HOST1X)   += host1x/
>  obj-$(CONFIG_IMX_IPUV3_CORE) += ipu-v3/
> +obj-$(CONFIG_HSA)+= hsa/
> \ No newline at end of file
> diff --git a/drivers/gpu/hsa/Kconfig b/drivers/gpu/hsa/Kconfig
> new file mode 100644
> index 000..ee7bb28
> --- /dev/null
> +++ b/drivers/gpu/hsa/Kconfig
> @@ -0,0 +1,20 @@
> +#
> +# Heterogenous system architecture configuration
> +#
> +
> +menuconfig HSA
> + bool "Heterogenous System Architecture"
> + default y
> + help
> +   Say Y here if you want Heterogenous System Architecture support.

Maybe be a bit more chatty here; there are already enough kernel options that
are cryptic even to kernel developers. Not everyone is well aware of all
the fancy three-letter acronyms GPUs use :)

> +
> +if HSA
> +
> +config HSA_RADEON
> + tristate "HSA kernel driver for AMD Radeon devices"
> + depends on HSA && AMD_IOMMU_V2 && X86_64
> + default m
> + help
> +   Enable this if you want to support HSA on AMD Radeon devices.
> +
> +endif # HSA
> diff --git a/drivers/gpu/hsa/Makefile b/drivers/gpu/hsa/Makefile
> new file mode 100644
> index 000..0951584
> --- /dev/null
> +++ b/drivers/gpu/hsa/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_HSA_RADEON) += radeon/
> diff --git a/drivers/gpu/hsa/radeon/Makefile b/drivers/gpu/hsa/radeon/Makefile
> new file mode 100644
> index 000..ba16a09
> --- /dev/null
> +++ b/drivers/gpu/hsa/radeon/Makefile
> @@ -0,0 +1,8 @@
> +#
> +# Makefile for Heterogenous System Architecture support for AMD Radeon 
> devices
> +#
> +
> +radeon_kfd-y := kfd_module.o kfd_device.o kfd_chardev.o \
> + kfd_pasid.o kfd_topology.o kfd_process.o
> +
> +obj-$(CONFIG_HSA_RADEON) += radeon_kfd.o
> diff --git a/drivers/gpu/hsa/radeon/kfd_chardev.c 
> b/drivers/gpu/hsa/radeon/kfd_chardev.c
> new file mode 100644
> index 000..7a56a8f
> --- /dev/null
> +++ b/drivers/gpu/hsa/radeon/kfd_chardev.c
> @@ -0,0 +1,133 @@
> +/*
> + * Copyright 2014 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated 

[Nouveau] [PATCH v4 2/6] drm/nouveau: map pages using DMA API on platform devices

2014-07-11 Thread Ben Skeggs
On Fri, Jul 11, 2014 at 12:35 PM, Alexandre Courbot  
wrote:
> On 07/10/2014 09:58 PM, Daniel Vetter wrote:
>>
>> On Tue, Jul 08, 2014 at 05:25:57PM +0900, Alexandre Courbot wrote:
>>>
>>> page_to_phys() is not the correct way to obtain the DMA address of a
>>> buffer on a non-PCI system. Use the DMA API functions for this, which
>>> are portable and will allow us to use other DMA API functions for
>>> buffer synchronization.
>>>
>>> Signed-off-by: Alexandre Courbot 
>>> ---
>>>   drivers/gpu/drm/nouveau/core/engine/device/base.c | 8 +++-
>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/base.c
>>> b/drivers/gpu/drm/nouveau/core/engine/device/base.c
>>> index 18c8c7245b73..e4e9e64988fe 100644
>>> --- a/drivers/gpu/drm/nouveau/core/engine/device/base.c
>>> +++ b/drivers/gpu/drm/nouveau/core/engine/device/base.c
>>> @@ -489,7 +489,10 @@ nv_device_map_page(struct nouveau_device *device,
>>> struct page *page)
>>> if (pci_dma_mapping_error(device->pdev, ret))
>>> ret = 0;
>>> } else {
>>> -   ret = page_to_phys(page);
>>> +   ret = dma_map_page(&device->platformdev->dev, page, 0,
>>> +  PAGE_SIZE, DMA_BIDIRECTIONAL);
>>> +   if (dma_mapping_error(&device->platformdev->dev, ret))
>>> +   ret = 0;
>>> }
>>>
>>> return ret;
>>> @@ -501,6 +504,9 @@ nv_device_unmap_page(struct nouveau_device *device,
>>> dma_addr_t addr)
>>> if (nv_device_is_pci(device))
>>> pci_unmap_page(device->pdev, addr, PAGE_SIZE,
>>>PCI_DMA_BIDIRECTIONAL);
>>
>>
>> pci_map/unmap alias to dma_unmap/map when called on the underlying struct
>> device embedded in pci_device (like for platform drivers). Dunno whether
>> it's worth to track a pointer to the struct device directly and always
>> call dma_unmap/map.
>
>
> Isn't it (theoretically) possible to have a platform that does not use the
> DMA API for its PCI implementation and thus requires the pci_* functions to
> be called? I could not find such a case in -next, which suggests that all
> PCI platforms have been converted to the DMA API already and that we could
> indeed refactor this to always use the DMA functions.
>
> But at the same time the way we use APIs should not be directed by their
> implementation, but by their intent - and unless the PCI API has been
> deprecated in some way (something I am not aware of), the rule is still that
> you should use it on a PCI device.
>
>
>>
>> Just drive-by comment since I'm interested in how you solve this - i915
>> has similar fun with buffer sharing and coherent and non-coherent
>> platforms. Although we don't have fun with pci and non-pci based
>> platforms.
>
>
> Yeah, I am not familiar with i915 but it seems like we are in a similar boat
> here (except ARM is more constrained as to its memory mappings). The
> strategy in this series is, map buffers used by user-space cached and
> explicitly synchronize them (since the ownership transition from user to GPU
> is always clearly performed by syscalls), and use coherent mappings for
> buffers used by the kernel which are accessed more randomly. This has solved
> all our coherency issues and resulted in the best performance so far.
I wonder if we might want to use unsnooped cached mappings of pages on
non-ARM platforms also, to avoid the overhead of the cache snooping?



[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Joe Perches
On Fri, 2014-07-11 at 15:22 -0400, Jerome Glisse wrote:
> Just to be explicit, my point is that if you claim GPL in MODULE_LICENSE
> then this is GPL licensed code; if you claim GPL with additional rights
> then this is dual licensed code. This is how I read and interpret "GPL
> with additional rights". In any case the radeon code is considered
> dual licensed, i.e. GPL+BSD (at least this is how I consider that code).

This is pretty common:

MODULE_LICENSE("Dual BSD/GPL");

There are a couple hundred of them.




[RESEND PATCH v3 06/11] drm: add DT bindings documentation for atmel-hlcdc-dc driver

2014-07-11 Thread Laurent Pinchart
Hi Boris,

On Thursday 10 July 2014 14:56:26 Boris BREZILLON wrote:
> On Thu, 10 Jul 2014 13:16:21 +0200 Laurent Pinchart wrote:
> > On Monday 07 July 2014 18:42:59 Boris BREZILLON wrote:
> > > The Atmel HLCDC (HLCD Controller) IP available on some Atmel SoCs (i.e.
> > > at91sam9n12, at91sam9x5 family or sama5d3 family) provides a display
> > > controller device.
> > > 
> > > The HLCDC block provides a single RGB output port, and only supports
> > > connection to LCD panels for now.
> > > 
> > > The atmel,panel property links the HLCDC RGB output with the LCD panel
> > > connected on this port (note that the HLCDC RGB connector implementation
> > > makes use of the DRM panel framework).
> > > 
> > > Connection to other external devices (DRM bridges) might be added later
> > > by means of a new atmel,xxx (atmel,bridge) property.
> > > 
> > > Signed-off-by: Boris BREZILLON 
> > > ---
> > > 
> > >  .../devicetree/bindings/drm/atmel-hlcdc-dc.txt | 59 +++
> > >  1 file changed, 59 insertions(+)
> > >  create mode 100644
> > >  Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt
> > > 
> > > diff --git a/Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt
> > > b/Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt new file mode
> > > 100644
> > > index 000..594bdb2
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/drm/atmel-hlcdc-dc.txt
> > > @@ -0,0 +1,59 @@
> > > +Device-Tree bindings for Atmel's HLCDC (High LCD Controller) DRM driver
> > > +
> > > +The Atmel HLCDC Display Controller is subdevice of the HLCDC MFD device.
> > > +See Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt for more details.
> > > +
> > > +Required properties:
> > > + - compatible: value should be one of the following:
> > > +   "atmel,hlcdc-dc"
> > > + - interrupts: the HLCDC interrupt definition
> > > + - pinctrl-names: the pin control state names. Should contain "default",
> > > +   "rgb-444", "rgb-565", "rgb-666" and "rgb-888".
> > > + - pinctrl-[0-4]: should contain the pinctrl states described by pinctrl
> > > +   names.
> > 
> > Do you need to switch between the different pinctrl configurations at
> > runtime, or is the configuration selected from the panel type, which
> > doesn't change ?
>
> At the moment no, but if we ever need to support different devices on
> the same RGB connector (actually Atmel's sama5d3xek boards have an
> RGB to HDMI bridge connected on the same RGB connector) and these
> devices do not support the same RGB mode (say your LCD panel supports
> RGB888 and your RGB to HDMI bridge supports RGB555), then depending on
> the output you select you'll have to change your pinctrl config at
> runtime.

Just to make sure I understand the use case correctly, are you talking about 
two devices (for example an RGB666 panel and an RGB888 RGB to HDMI bridge) 
connected to the same output, with the ability to switch between the two at 
runtime ? That's a valid case (on a side note we shouldn't forget that the 
option of using both devices at the same time should be supported as well), 
but I would probably go for a fixed pinctrl configuration that supports both, 
although switching configurations at runtime would be a micro-optimization 
that might make sense.

> I'd say we could get rid of this runtime pinctrl config as a first step
> if DT ABI stability was not required.
> But it is, and I'd like to have a future proof binding to handle these
> tricky cases when they occurs (if they ever do).

I think we have a shortcoming of the pinctrl API here in the general case. The 
API only allows you to select a single configuration per device. Imagine the 
same display controller, with two DPI outputs, each of them configurable in 
444, 565, 666 or 888 modes. With the current API we would have to create 4*4 = 
16 pinctrl configurations for all combinations. That obviously wouldn't scale, 
so we'll have to fix this eventually. From a DT stability point of view, I 
would thus avoid specifying multiple pinctrl configurations now until we come 
up with a standard way to support this use case.
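
The scaling argument can be checked with a few lines of userspace C. This is
a sketch only: the function names are hypothetical and the "per-output"
model is an assumption about what a future API could allow, not an existing
pinctrl feature. It compares the number of states needed when a single
state must describe the whole device against the number needed if each
output could select its state independently.

```c
#include <assert.h>

/*
 * Whole-device model: only one pinctrl state can be selected per device, so
 * every combination of per-output modes needs its own state.
 * Result: modes ^ outputs.
 */
static unsigned long combined_states(unsigned int modes, unsigned int outputs)
{
	unsigned long count = 1;

	assert(modes > 0);
	while (outputs-- > 0)
		count *= modes;
	return count;
}

/*
 * Hypothetical per-output model: each output picks its own state
 * independently, so the states only add up. Result: modes * outputs.
 */
static unsigned long per_output_states(unsigned int modes, unsigned int outputs)
{
	assert(modes > 0);
	return (unsigned long)modes * outputs;
}
```

For the two-output, four-mode controller described above,
combined_states(4, 2) gives the 16 configurations mentioned, while
per_output_states(4, 2) would need 8; the gap widens quickly with a third
output.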

> Anyway, I'm open to any other alternative that could let me add support
> for this later on.
> 
> BTW, is there any reason for not defining an RGB connector type (I'm
> currently defining HLCDC connector as an LVDS connector) ?

Not that I know of. The DRM API has been developed before display on embedded 
systems became such a hot topic. If we had to redo it today, panels might be 
exposed to userspace as such, with a connector. We have to live with the past, 
so the connector will stay, but adding a new RGB connector type could make 
sense (although we might need a different name, in a way the VGA and LVDS 
connectors also carry RGB signals).

> > > + - atmel,panel: Should contain a phandle with 2 parameters.
> > > +   The first cell is a phandle to a DRM panel device
> > > +   The second cell encodes the RGB mode, which can take the following
> > > 

[PATCH 08/83] drm/radeon: Add calls to initialize and finalize kfd from radeon

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:08AM +0300, Oded Gabbay wrote:
> The KFD driver should be loaded when the radeon driver is loaded and
> should be finalized when the radeon driver is removed.
> 
> This patch adds a function call to initialize kfd from radeon_init
> and a function call to finalize kfd from radeon_exit.
> 
> If the KFD driver is not present in the system, the initialize call
> fails and the radeon driver continues normally.
> 
> This patch also adds calls to probe, initialize and finalize a kfd device
> per radeon device using the kgd-->kfd interface.
> 
> Signed-off-by: Oded Gabbay 

It might be nice to allow building radeon without HSA, so I think a
CONFIG_HSA option should be added and the other things made to depend on it.
Otherwise this one is:

Reviewed-by: Jérôme Glisse 
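
A common way to make such a dependency optional is to provide empty inline
stubs when the option is off, so that callers need no #ifdef of their own.
The following userspace sketch shows the pattern; the CONFIG_HSA_SKETCH
macro and every *_sketch name are hypothetical stand-ins, not symbols from
this patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Build with -DCONFIG_HSA_SKETCH to compile the optional part in. The macro
 * and every *_sketch name below are hypothetical.
 */
#ifdef CONFIG_HSA_SKETCH
static bool kfd_sketch_init(void) { return true; }
static void kfd_sketch_fini(void) { }
#else
/* Stubs: the main driver calls these unconditionally. */
static inline bool kfd_sketch_init(void) { return false; }
static inline void kfd_sketch_fini(void) { }
#endif

struct gpu_sketch {
	bool loaded;
	bool kfd_active;
};

static int gpu_sketch_load(struct gpu_sketch *gpu)
{
	assert(gpu != NULL);
	/* The optional part failing to start is not an error for the driver. */
	gpu->kfd_active = kfd_sketch_init();
	gpu->loaded = true;
	return 0;
}

static void gpu_sketch_unload(struct gpu_sketch *gpu)
{
	assert(gpu != NULL);
	if (gpu->kfd_active) {
		kfd_sketch_fini();
		gpu->kfd_active = false;
	}
	gpu->loaded = false;
}
```

Built without the macro, gpu_sketch_load() still succeeds and simply leaves
kfd_active false, which mirrors the "radeon continues normally" behaviour
described in the commit message.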


> ---
>  drivers/gpu/drm/radeon/radeon_drv.c | 6 ++
>  drivers/gpu/drm/radeon/radeon_kms.c | 9 +
>  2 files changed, 15 insertions(+)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
> b/drivers/gpu/drm/radeon/radeon_drv.c
> index cb14213..88a45a0 100644
> --- a/drivers/gpu/drm/radeon/radeon_drv.c
> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
> @@ -151,6 +151,9 @@ static inline void radeon_register_atpx_handler(void) {}
>  static inline void radeon_unregister_atpx_handler(void) {}
>  #endif
>  
> +extern bool radeon_kfd_init(void);
> +extern void radeon_kfd_fini(void);
> +
>  int radeon_no_wb;
>  int radeon_modeset = -1;
>  int radeon_dynclks = -1;
> @@ -630,12 +633,15 @@ static int __init radeon_init(void)
>  #endif
>   }
>  
> + radeon_kfd_init();
> +
>   /* let modprobe override vga console setting */
>   return drm_pci_init(driver, pdriver);
>  }
>  
>  static void __exit radeon_exit(void)
>  {
> + radeon_kfd_fini();
>   drm_pci_exit(driver, pdriver);
>   radeon_unregister_atpx_handler();
>  }
> diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
> b/drivers/gpu/drm/radeon/radeon_kms.c
> index 35d9318..0748284 100644
> --- a/drivers/gpu/drm/radeon/radeon_kms.c
> +++ b/drivers/gpu/drm/radeon/radeon_kms.c
> @@ -34,6 +34,10 @@
>  #include 
>  #include 
>  
> +extern void radeon_kfd_device_probe(struct radeon_device *rdev);
> +extern void radeon_kfd_device_init(struct radeon_device *rdev);
> +extern void radeon_kfd_device_fini(struct radeon_device *rdev);
> +
>  #if defined(CONFIG_VGA_SWITCHEROO)
>  bool radeon_has_atpx(void);
>  #else
> @@ -63,6 +67,8 @@ int radeon_driver_unload_kms(struct drm_device *dev)
>  
>   pm_runtime_get_sync(dev->dev);
>  
> + radeon_kfd_device_fini(rdev);
> +
>   radeon_acpi_fini(rdev);
>   
>   radeon_modeset_fini(rdev);
> @@ -142,6 +148,9 @@ int radeon_driver_load_kms(struct drm_device *dev, 
> unsigned long flags)
>   "Error during ACPI methods call\n");
>   }
>  
> + radeon_kfd_device_probe(rdev);
> + radeon_kfd_device_init(rdev);
> +
>   if (radeon_is_px(dev)) {
>   pm_runtime_use_autosuspend(dev->dev);
>   pm_runtime_set_autosuspend_delay(dev->dev, 5000);
> -- 
> 1.9.1
> 


[PATCH 07/83] drm/radeon: Add kfd-->kgd interface of locking srbm_gfx_cntl register

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:07AM +0300, Oded Gabbay wrote:
> This patch adds a new interface to kfd2kgd_calls structure, which
> allows the kfd to lock and unlock the srbm_gfx_cntl register

Why does kfd need to lock this register if kfd cannot access
any of those registers? This sounds broken to me; exposing a
driver-internal mutex to another driver is not something I am a
fan of.
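
To illustrate the concern, here is one possible alternative shape, sketched
in userspace C. It is not something proposed in this thread, and it assumes
the register acts as a bank selector for other registers. The owning driver
keeps the lock private and exports a complete operation instead of
lock/unlock calls. All names are hypothetical, and a bool stands in for the
real mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SRBM_BANKS_SKETCH 4

struct gfx_dev_sketch {
	bool srbm_locked;			/* stands in for a real mutex */
	uint32_t srbm_select;			/* stands in for the selector register */
	uint32_t banked[SRBM_BANKS_SKETCH];	/* one value per bank */
};

/* Private to the owning driver: never handed to the other driver. */
static void srbm_lock_sketch(struct gfx_dev_sketch *d)
{
	assert(!d->srbm_locked);
	d->srbm_locked = true;
}

static void srbm_unlock_sketch(struct gfx_dev_sketch *d)
{
	assert(d->srbm_locked);
	d->srbm_locked = false;
}

/*
 * The only exported entry point: select, write and deselect happen under
 * the lock, so the caller can never leave it held or forget to take it.
 * Returns 0 on success, -1 for an invalid bank.
 */
static int gfx_write_banked_sketch(struct gfx_dev_sketch *d, uint32_t bank,
				   uint32_t val)
{
	if (bank >= SRBM_BANKS_SKETCH)
		return -1;
	srbm_lock_sketch(d);
	d->srbm_select = bank;
	d->banked[bank] = val;
	d->srbm_select = 0;
	srbm_unlock_sketch(d);
	return 0;
}
```

The trade-off is a wider interface (one call per operation) in exchange for
keeping the locking rules inside a single driver.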

Cheers,
Jérôme

> 
> Signed-off-by: Oded Gabbay 
> ---
>  drivers/gpu/drm/radeon/radeon_kfd.c | 20 
>  include/linux/radeon_kfd.h  |  4 
>  2 files changed, 24 insertions(+)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c 
> b/drivers/gpu/drm/radeon/radeon_kfd.c
> index 66ee36b..594020e 100644
> --- a/drivers/gpu/drm/radeon/radeon_kfd.c
> +++ b/drivers/gpu/drm/radeon/radeon_kfd.c
> @@ -43,6 +43,10 @@ static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem 
> *mem);
>  
>  static uint64_t get_vmem_size(struct kgd_dev *kgd);
>  
> +static void lock_srbm_gfx_cntl(struct kgd_dev *kgd);
> +static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
> +
> +
>  static const struct kfd2kgd_calls kfd2kgd = {
>   .allocate_mem = allocate_mem,
>   .free_mem = free_mem,
> @@ -51,6 +55,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>   .kmap_mem = kmap_mem,
>   .unkmap_mem = unkmap_mem,
>   .get_vmem_size = get_vmem_size,
> + .lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
> + .unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
>  };
>  
>  static const struct kgd2kfd_calls *kgd2kfd;
> @@ -233,3 +239,17 @@ static uint64_t get_vmem_size(struct kgd_dev *kgd)
>  
>   return rdev->mc.real_vram_size;
>  }
> +
> +static void lock_srbm_gfx_cntl(struct kgd_dev *kgd)
> +{
> + struct radeon_device *rdev = (struct radeon_device *)kgd;
> +
> + mutex_lock(&rdev->srbm_mutex);
> +}
> +
> +static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd)
> +{
> + struct radeon_device *rdev = (struct radeon_device *)kgd;
> +
> + mutex_unlock(&rdev->srbm_mutex);
> +}
> diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
> index c7997d4..40b691c 100644
> --- a/include/linux/radeon_kfd.h
> +++ b/include/linux/radeon_kfd.h
> @@ -81,6 +81,10 @@ struct kfd2kgd_calls {
>   void (*unkmap_mem)(struct kgd_dev *kgd, struct kgd_mem *mem);
>  
>   uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
> +
> + /* SRBM_GFX_CNTL mutex */
> + void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
> + void (*unlock_srbm_gfx_cntl)(struct kgd_dev *kgd);
>  };
>  
>  bool kgd2kfd_init(unsigned interface_version,
> -- 
> 1.9.1
> 


[PATCH 06/83] drm/radeon: Add kfd-->kgd interfaces of memory allocation/mapping

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:06AM +0300, Oded Gabbay wrote:
> This patch adds new interfaces to kfd2kgd_calls structure.
> 
> The new interfaces allow the kfd driver to :
> 
> 1. Allocated video memory through the radeon driver
> 2. Map and unmap video memory with GPUVM through the radeon driver
> 3. Map and unmap system memory with GPUVM through the radeon driver
> 
> Signed-off-by: Oded Gabbay 
> ---
>  drivers/gpu/drm/radeon/radeon_kfd.c | 129 
> 
>  include/linux/radeon_kfd.h  |  23 +++
>  2 files changed, 152 insertions(+)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c 
> b/drivers/gpu/drm/radeon/radeon_kfd.c
> index 1b859b5..66ee36b 100644
> --- a/drivers/gpu/drm/radeon/radeon_kfd.c
> +++ b/drivers/gpu/drm/radeon/radeon_kfd.c
> @@ -25,9 +25,31 @@
>  #include 
>  #include "radeon.h"
>  
> +struct kgd_mem {
> + struct radeon_bo *bo;
> + u32 domain;
> +};
> +
> +static int allocate_mem(struct kgd_dev *kgd, size_t size, size_t alignment,
> + enum kgd_memory_pool pool, struct kgd_mem **memory_handle);
> +
> +static void free_mem(struct kgd_dev *kgd, struct kgd_mem *memory_handle);
> +
> +static int gpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, uint64_t 
> *vmid0_address);
> +static void ungpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem);
> +
> +static int kmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, void **ptr);
> +static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem);
> +
>  static uint64_t get_vmem_size(struct kgd_dev *kgd);
>  
>  static const struct kfd2kgd_calls kfd2kgd = {
> + .allocate_mem = allocate_mem,
> + .free_mem = free_mem,
> + .gpumap_mem = gpumap_mem,
> + .ungpumap_mem = ungpumap_mem,
> + .kmap_mem = kmap_mem,
> + .unkmap_mem = unkmap_mem,
>   .get_vmem_size = get_vmem_size,
>  };
>  
> @@ -96,6 +118,113 @@ void radeon_kfd_device_fini(struct radeon_device *rdev)
>   }
>  }
>  
> +static u32 pool_to_domain(enum kgd_memory_pool p)
> +{
> + switch (p) {
> + case KGD_POOL_FRAMEBUFFER: return RADEON_GEM_DOMAIN_VRAM;
> + default: return RADEON_GEM_DOMAIN_GTT;
> + }
> +}
> +
> +static int allocate_mem(struct kgd_dev *kgd, size_t size, size_t alignment,
> + enum kgd_memory_pool pool, struct kgd_mem **memory_handle)
> +{
> + struct radeon_device *rdev = (struct radeon_device *)kgd;
> + struct kgd_mem *mem;
> + int r;
> +
> + mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
> + if (!mem)
> + return -ENOMEM;
> +
> + mem->domain = pool_to_domain(pool);
> +
> + r = radeon_bo_create(rdev, size, alignment, true, mem->domain, NULL, &mem->bo);
> + if (r) {
> + kfree(mem);
> + return r;
> + }
> +
> + *memory_handle = mem;
> + return 0;
> +}
> +
> +static void free_mem(struct kgd_dev *kgd, struct kgd_mem *mem)
> +{
> + /* Assume that KFD will never free gpumapped or kmapped memory. This is not quite settled. */
> + radeon_bo_unref(&mem->bo);
> + kfree(mem);
> +}
> +
> +static int gpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, uint64_t 
> *vmid0_address)
> +{
> + int r;
> +
> + r = radeon_bo_reserve(mem->bo, true);
> +
> + /*
> +  * ttm_bo_reserve can only fail if the buffer reservation lock
> +  * is held in circumstances that would deadlock
> +  */
> + BUG_ON(r != 0);
> + r = radeon_bo_pin(mem->bo, mem->domain, vmid0_address);
> + radeon_bo_unreserve(mem->bo);
> +
> + return r;
> +}

What is the lifetime of such objects? Are they limited in size and number?
How can the GFX side, i.e. radeon, force-unmap them?

Pinning is a big NO: we only pin a handful of buffers, and the only
thing we allow userspace to pin are front buffers associated with a crtc,
which means there is a limited number of such buffers and there is a
legitimate use for pinning them.

To me this looks like anyone can pin VRAM and thus starve the GFX side,
read: denial of service.
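
One way to bound the exposure described above would be to charge every pin
against a per-client budget and refuse pins beyond it. The following is a
minimal userspace sketch of that accounting only; pin_budget, pin_charge and
pin_uncharge are hypothetical names and are not part of radeon or of the
proposed kfd interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-client accounting of pinned bytes. */
struct pin_budget {
	uint64_t limit;   /* maximum bytes this client may keep pinned */
	uint64_t pinned;  /* bytes currently pinned (always <= limit) */
};

/* Charge a pin of 'size' bytes; returns 0, or -ENOMEM if over budget. */
static int pin_charge(struct pin_budget *b, uint64_t size)
{
	assert(b != NULL);
	/* Compare against the remaining room so pinned + size cannot overflow. */
	if (size > b->limit - b->pinned)
		return -ENOMEM;
	b->pinned += size;
	return 0;
}

/* Release a previous charge when the buffer is unpinned. */
static void pin_uncharge(struct pin_budget *b, uint64_t size)
{
	assert(b != NULL && size <= b->pinned);
	b->pinned -= size;
}
```

A budget like this limits how much one client can pin; it does not by itself
answer the force-unmap question raised above.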

Cheers,
Jérôme

> +
> +static void ungpumap_mem(struct kgd_dev *kgd, struct kgd_mem *mem)
> +{
> + int r;
> +
> + r = radeon_bo_reserve(mem->bo, true);
> +
> + /*
> +  * ttm_bo_reserve can only fail if the buffer reservation lock
> +  * is held in circumstances that would deadlock
> +  */
> + BUG_ON(r != 0);
> + r = radeon_bo_unpin(mem->bo);
> +
> + /*
> +  * This unpin only removed NO_EVICT placement flags
> +  * and should never fail
> +  */
> + BUG_ON(r != 0);
> + radeon_bo_unreserve(mem->bo);
> +}
> +
> +static int kmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem, void **ptr)
> +{
> + int r;
> +
> + r = radeon_bo_reserve(mem->bo, true);
> +
> + /*
> +  * ttm_bo_reserve can only fail if the buffer reservation lock
> +  * is held in circumstances that would deadlock
> +  */
> + BUG_ON(r != 0);
> + r = radeon_bo_kmap(mem->bo, ptr);
> + radeon_bo_unreserve(mem->bo);
> +
> + return r;
> +}
> +
> 

[PATCH 05/83] drm/radeon: Add kfd-->kgd interface to get virtual ram size

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:05AM +0300, Oded Gabbay wrote:
> This patch adds a new interface to kfd2kgd_calls structure so that
> the kfd driver could get the virtual ram size of a specific
> radeon device.
> 
> Signed-off-by: Oded Gabbay 

What is vmem_size? This needs to be documented. I assume this is the
number of bits the GPU can handle, and I would assume that the minimum
requirement is that the device has at least as many bits as the CPU,
i.e. on 48-bit x86-64 the hardware also needs to support that.

Otherwise this sounds like broken things can happen.

Cheers,
Jérôme

> ---
>  drivers/gpu/drm/radeon/radeon_kfd.c | 12 
>  include/linux/radeon_kfd.h  |  1 +
>  2 files changed, 13 insertions(+)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c 
> b/drivers/gpu/drm/radeon/radeon_kfd.c
> index 7c7f808..1b859b5 100644
> --- a/drivers/gpu/drm/radeon/radeon_kfd.c
> +++ b/drivers/gpu/drm/radeon/radeon_kfd.c
> @@ -25,7 +25,10 @@
>  #include 
>  #include "radeon.h"
>  
> +static uint64_t get_vmem_size(struct kgd_dev *kgd);
> +
>  static const struct kfd2kgd_calls kfd2kgd = {
> + .get_vmem_size = get_vmem_size,
>  };
>  
>  static const struct kgd2kfd_calls *kgd2kfd;
> @@ -92,3 +95,12 @@ void radeon_kfd_device_fini(struct radeon_device *rdev)
>   rdev->kfd = NULL;
>   }
>  }
> +
> +static uint64_t get_vmem_size(struct kgd_dev *kgd)
> +{
> + struct radeon_device *rdev = (struct radeon_device *)kgd;
> +
> + BUG_ON(kgd == NULL);
> +
> + return rdev->mc.real_vram_size;
> +}
> diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
> index 59785e9..28cddf5 100644
> --- a/include/linux/radeon_kfd.h
> +++ b/include/linux/radeon_kfd.h
> @@ -57,6 +57,7 @@ struct kgd2kfd_calls {
>  };
>  
>  struct kfd2kgd_calls {
> + uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
>  };
>  
>  bool kgd2kfd_init(unsigned interface_version,
> -- 
> 1.9.1
> 


[PATCH 04/83] drm/radeon: Add radeon <--> kfd interface

2014-07-11 Thread Jerome Glisse
On Thu, Jul 10, 2014 at 03:38:33PM -0700, Joe Perches wrote:
> On Fri, 2014-07-11 at 00:50 +0300, Oded Gabbay wrote:
> > This patch adds the interface between the radeon driver and the kfd
> > driver. The interface implementation is contained in
> > radeon_kfd.c and radeon_kfd.h.
> []
> >  include/linux/radeon_kfd.h  | 67 ++
> 
> Is there a good reason to put this file in include/linux?
> 

Agreed, we do not want to clutter include/linux/ with driver-specific
includes. I think it's one of the rules, even though there are some hw
headers already in there.

I would rather see either a new dir include/hsa or put it inside include/drm.

Cheers,
Jérôme


[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-11 Thread Alex Deucher
On Fri, Jul 11, 2014 at 12:18 PM, Christian König
 wrote:
> Am 11.07.2014 18:05, schrieb Jerome Glisse:
>
>> On Fri, Jul 11, 2014 at 12:50:02AM +0300, Oded Gabbay wrote:
>>>
>>> To support HSA on KV, we need to limit the number of vmids and pipes
>>> that are available for radeon's use with KV.
>>>
>>> This patch reserves VMIDs 8-15 for KFD (so radeon can only use VMIDs
>>> 0-7) and also makes radeon think that KV has only a single MEC with a
>>> single pipe in it.
>>>
>>> Signed-off-by: Oded Gabbay 
>>
>> Reviewed-by: Jérôme Glisse
>
>
> At least for the VMIDs, on-demand allocation should be trivial to implement,
> so I would prefer this over a fixed assignment.

IIRC, the way the CP hw scheduler works you have to give it a range of
vmids and it assigns them dynamically as queues are mapped so
effectively they are potentially in use once the CP scheduler is set
up.
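
The split discussed in this thread (radeon keeps VMIDs 0-7, the scheduler is
handed 8-15) can be pictured as a contiguous range turned into a bitmask.
This is only an illustration of the arithmetic; vmid_range_mask is a made-up
helper, and whether the scheduler is actually programmed with such a mask is
an assumption here.

```c
#include <assert.h>
#include <stdint.h>

#define TOTAL_VMIDS 16u  /* the patch lowers nvm from 16 to 8 for radeon */

/* Build a mask with bits first..last set (inclusive). Hypothetical helper. */
static uint16_t vmid_range_mask(unsigned int first, unsigned int last)
{
	assert(first <= last && last < TOTAL_VMIDS);
	uint32_t width = last - first + 1;  /* at most 16, so the shift is safe */
	return (uint16_t)(((1u << width) - 1u) << first);
}
```

With this helper, vmid_range_mask(8, 15) would cover the KFD half and
vmid_range_mask(0, 7) the radeon half, with no bit shared between them.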

Alex


>
> Christian.
>
>
>>
>>> ---
>>>   drivers/gpu/drm/radeon/cik.c | 48
>>> ++--
>>>   1 file changed, 24 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
>>> index 4bfc2c0..e0c8052 100644
>>> --- a/drivers/gpu/drm/radeon/cik.c
>>> +++ b/drivers/gpu/drm/radeon/cik.c
>>> @@ -4662,12 +4662,11 @@ static int cik_mec_init(struct radeon_device
>>> *rdev)
>>> /*
>>>  * KV:2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
>>>  * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total
>>> +* Nonetheless, we assign only 1 pipe because all other pipes
>>> will
>>> +* be handled by KFD
>>>  */
>>> -   if (rdev->family == CHIP_KAVERI)
>>> -   rdev->mec.num_mec = 2;
>>> -   else
>>> -   rdev->mec.num_mec = 1;
>>> -   rdev->mec.num_pipe = 4;
>>> +   rdev->mec.num_mec = 1;
>>> +   rdev->mec.num_pipe = 1;
>>> rdev->mec.num_queue = rdev->mec.num_mec * rdev->mec.num_pipe * 8;
>>> if (rdev->mec.hpd_eop_obj == NULL) {
>>> @@ -4809,28 +4808,24 @@ static int cik_cp_compute_resume(struct
>>> radeon_device *rdev)
>>> /* init the pipes */
>>> mutex_lock(&rdev->srbm_mutex);
>>> -   for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
>>> -   int me = (i < 4) ? 1 : 2;
>>> -   int pipe = (i < 4) ? i : (i - 4);
>>>   - eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i *
>>> MEC_HPD_SIZE * 2);
>>> +   eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;
>>>   - cik_srbm_select(rdev, me, pipe, 0, 0);
>>> +   cik_srbm_select(rdev, 0, 0, 0, 0);
>>>   - /* write the EOP addr */
>>> -   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
>>> -   WREG32(CP_HPD_EOP_BASE_ADDR_HI,
>>> upper_32_bits(eop_gpu_addr) >> 8);
>>> +   /* write the EOP addr */
>>> +   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
>>> +   WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >>
>>> 8);
>>>   - /* set the VMID assigned */
>>> -   WREG32(CP_HPD_EOP_VMID, 0);
>>> +   /* set the VMID assigned */
>>> +   WREG32(CP_HPD_EOP_VMID, 0);
>>> +
>>> +   /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
>>> +   tmp = RREG32(CP_HPD_EOP_CONTROL);
>>> +   tmp &= ~EOP_SIZE_MASK;
>>> +   tmp |= order_base_2(MEC_HPD_SIZE / 8);
>>> +   WREG32(CP_HPD_EOP_CONTROL, tmp);
>>>   - /* set the EOP size, register value is 2^(EOP_SIZE+1)
>>> dwords */
>>> -   tmp = RREG32(CP_HPD_EOP_CONTROL);
>>> -   tmp &= ~EOP_SIZE_MASK;
>>> -   tmp |= order_base_2(MEC_HPD_SIZE / 8);
>>> -   WREG32(CP_HPD_EOP_CONTROL, tmp);
>>> -   }
>>> -   cik_srbm_select(rdev, 0, 0, 0, 0);
>>> mutex_unlock(&rdev->srbm_mutex);
>>> /* init the queues.  Just two for now. */
>>> @@ -5876,8 +5871,13 @@ int cik_ib_parse(struct radeon_device *rdev,
>>> struct radeon_ib *ib)
>>>*/
>>>   int cik_vm_init(struct radeon_device *rdev)
>>>   {
>>> -   /* number of VMs */
>>> -   rdev->vm_manager.nvm = 16;
>>> +   /*
>>> +* number of VMs
>>> +* VMID 0 is reserved for Graphics
>>> +* radeon compute will use VMIDs 1-7
>>> +* KFD will use VMIDs 8-15
>>> +*/
>>> +   rdev->vm_manager.nvm = 8;
>>> /* base offset of vram pages */
>>> if (rdev->flags & RADEON_IS_IGP) {
>>> u64 tmp = RREG32(MC_VM_FB_OFFSET);
>>> --
>>> 1.9.1
>>>
>
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 03/83] drm/radeon: Report doorbell configuration to kfd

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:03AM +0300, Oded Gabbay wrote:
> Radeon and KFD share the doorbell aperture.
> Radeon sets it up, takes the doorbells required for its own rings
> and reports the setup to KFD.
> Radeon reserved doorbells are at the start of the doorbell aperture.
> 
> Signed-off-by: Oded Gabbay 

I would need a refresher on doorbells. You want to map them to userspace,
but at the same time they are used by the radeon kernel driver when
dispatching on the compute ring (IIRC the gfx ring does not use them).

So now my worry is: given that user mapping is done at page granularity,
what would block one process from writing to another process's doorbell?
Again IIRC, the doorbell is actually the wptr for the ring buffer
associated with said doorbell (though I forget how doorbells are
associated with a ring).

This sounds really bad.
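
One possible answer to the page-granularity worry is to hand each process a
whole page of the aperture, starting at the first page boundary past the
bytes radeon reserves, so that a user mapping never exposes another
process's doorbells. The sketch below only shows that offset arithmetic in
userspace C; process_doorbell_offset and the 4 KiB page size are assumptions,
not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DOORBELL_PAGE_SIZE 4096u  /* assumed CPU page size */

/* Round x up to the next multiple of the page size. */
static size_t page_align_up(size_t x)
{
	return (x + DOORBELL_PAGE_SIZE - 1) & ~(size_t)(DOORBELL_PAGE_SIZE - 1);
}

/*
 * Offset of the doorbell page owned by process slot 'index', or
 * (size_t)-1 if the aperture has no room for it. 'start_offset' is the
 * number of bytes reserved for radeon at the start of the aperture.
 */
static size_t process_doorbell_offset(size_t aperture_size,
				      size_t start_offset,
				      unsigned int index)
{
	size_t first = page_align_up(start_offset);

	if (first > aperture_size)
		return (size_t)-1;
	if (index >= (aperture_size - first) / DOORBELL_PAGE_SIZE)
		return (size_t)-1;
	return first + (size_t)index * DOORBELL_PAGE_SIZE;
}
```

The cost of this layout is that the partial page shared with radeon's own
doorbells is never handed to userspace.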

Cheers,
Jérôme

> ---
>  drivers/gpu/drm/radeon/radeon.h|  4 
>  drivers/gpu/drm/radeon/radeon_device.c | 31 +++
>  2 files changed, 35 insertions(+)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 7cda75d..4e7e41f 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -676,6 +676,10 @@ struct radeon_doorbell {
>  
>  int radeon_doorbell_get(struct radeon_device *rdev, u32 *page);
>  void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell);
> +void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
> +   phys_addr_t *aperture_base,
> +   size_t *aperture_size,
> +   size_t *start_offset);
>  
>  /*
>   * IRQS.
> diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
> b/drivers/gpu/drm/radeon/radeon_device.c
> index 03686fa..98538d2 100644
> --- a/drivers/gpu/drm/radeon/radeon_device.c
> +++ b/drivers/gpu/drm/radeon/radeon_device.c
> @@ -328,6 +328,37 @@ void radeon_doorbell_free(struct radeon_device *rdev, 
> u32 doorbell)
>   __clear_bit(doorbell, rdev->doorbell.used);
>  }
>  
> +/**
> + * radeon_doorbell_get_kfd_info - Report doorbell configuration required to
> + *setup KFD
> + *
> + * @rdev: radeon_device pointer
> + * @aperture_base: output returning doorbell aperture base physical address
> + * @aperture_size: output returning doorbell aperture size in bytes
> + * @start_offset: output returning # of doorbell bytes reserved for radeon.
> + *
> + * Radeon and the KFD share the doorbell aperture. Radeon sets it up,
> + * takes doorbells required for its own rings and reports the setup to KFD.
> + * Radeon reserved doorbells are at the start of the doorbell aperture.
> + */
> +void radeon_doorbell_get_kfd_info(struct radeon_device *rdev,
> +   phys_addr_t *aperture_base,
> +   size_t *aperture_size,
> +   size_t *start_offset)
> +{
> + /* The first num_doorbells are used by radeon.
> +  * KFD takes whatever's left in the aperture. */
> + if (rdev->doorbell.size > rdev->doorbell.num_doorbells * sizeof(u32)) {
> + *aperture_base = rdev->doorbell.base;
> + *aperture_size = rdev->doorbell.size;
> + *start_offset = rdev->doorbell.num_doorbells * sizeof(u32);
> + } else {
> + *aperture_base = 0;
> + *aperture_size = 0;
> + *start_offset = 0;
> + }
> +}
> +
>  /*
>   * radeon_wb_*()
>   * Writeback is the the method by which the the GPU updates special pages
> -- 
> 1.9.1
> 


[PATCH 8/8] drm/tilcdc: panel: Add support for enable GPIO

2014-07-11 Thread Fabio Estevam
On Fri, Jul 11, 2014 at 11:18 AM, Ezequiel Garcia
 wrote:
> In order to support the "enable GPIO" available in many panel devices,
> this commit adds a proper devicetree binding.
>
> By providing an enable GPIO in the devicetree, the driver can now turn
> off and on the panel device, and/or the backlight device. Both the
> backlight and the GPIO are optional properties.
> +   panel_mod->enable_gpio = devm_gpiod_get(>dev, "enable");
> +   if (IS_ERR(panel_mod->enable_gpio)) {
> +   ret = PTR_ERR(panel_mod->enable_gpio);
> +   if (ret != -ENOENT) {

Shouldn't this be controlled by a regulator instead? What if the panel
is powered from a PMIC output?


[RFC PATCH] drm: rework flip-work helpers to avoid calling func when the FIFO is full

2014-07-11 Thread Rob Clark
On Fri, Jul 11, 2014 at 11:47 AM, Boris BREZILLON
 wrote:
> On Fri, 11 Jul 2014 11:41:12 -0400
> Rob Clark  wrote:
>
>> On Fri, Jul 11, 2014 at 11:17 AM, Boris BREZILLON
>>  wrote:
>> > Make use of lists instead of kfifo in order to dynamically allocate
>> > a task entry when someone requires some delayed work, thus preventing
>> > drm_flip_work_queue from directly calling func instead of queuing this
>> > call.
>> > This allows drm_flip_work_queue to be safely called even within irq
>> > handlers.
>> >
>> > Add new helper functions to allocate a flip work task and queue it when
>> > needed. This prevents allocating data within irq context (which might
>> > impact the time spent in the irq handler).
>> >
>> > Signed-off-by: Boris BREZILLON 
>> > ---
>> > Hi Rob,
>> >
>> > This is a proposal for what you suggested (dynamically growing the drm
>> > flip work queue in order to avoid direct call of work->func when calling
>> > drm_flip_work_queue).
>> >
>> > I'm not sure this is exactly what you expected, because I'm now using
>> > lists instead of kfifo (and thus lose the lockless part), but at least
>> > we can now safely call drm_flip_work_queue or drm_flip_work_queue_task
>> > from irq handlers :-).
>> >
>> > You were also worried about queueing the same framebuffer multiple times
>> > and with this implementation you shouldn't have any problem (at least with
>> > drm_flip_work_queue; what people do with drm_flip_work_queue_task is their
>> > own responsibility, but they should allocate one task for each operation
>> > even if they are manipulating the same framebuffer).
>>
>> yeah, if we are dynamically allocating the list nodes, that solves the
>> queuing-up-multiple-times issue..
>>
>> I wonder if drm_flip_work_allocate_task() should use GFP_ATOMIC when
>> allocating?
>
> That's funny, I was actually modifying the API to pass gfp_t flags to
> this function ;-)

yeah, I think passing gfp flags is the better idea

>> I guess maybe it is possible to pre-allocate the task
>> from non-irq context, and then queue it from irq context.. it makes
>> the API a bit more complex, but there are only a couple users
>> currently, so I suppose this should be doable.
>
> I tried to keep the existing API so that existing users won't see the
> difference (I guess none of them are calling drm_flip_work_queue).

we do have existing users that call drm_flip_work_queue() from irq..
but I suppose adding gfp flags arg to drm_flip_work_queue() seems like
a reasonable solution.
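
The pre-allocate-then-queue variant mentioned above can be sketched as
follows: the allocation happens up front, where sleeping is allowed, and the
queue step itself never allocates. This is a plain userspace model with
malloc() standing in for kernel allocation; flip_task, flip_queue and the
three functions are illustrative names, not the API from the RFC, and a real
implementation would also need a lock around the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the flip-work task and its queue. */
struct flip_task {
	struct flip_task *next;
	void *data;
};

struct flip_queue {
	struct flip_task *head;
	struct flip_task *tail;
};

/* May fail or block: models the part done outside irq context. */
static struct flip_task *flip_task_alloc(void *data)
{
	struct flip_task *t = malloc(sizeof(*t));

	if (t) {
		t->next = NULL;
		t->data = data;
	}
	return t;
}

/* Never allocates: models the part that could run in an irq handler. */
static void flip_queue_task(struct flip_queue *q, struct flip_task *t)
{
	assert(q != NULL && t != NULL);
	t->next = NULL;
	if (q->tail)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
}

/* Pop the oldest task, or NULL when empty; the caller frees it. */
static struct flip_task *flip_queue_pop(struct flip_queue *q)
{
	struct flip_task *t = q->head;

	if (t) {
		q->head = t->next;
		if (!q->head)
			q->tail = NULL;
	}
	return t;
}
```

Because each queued operation owns its own node, queuing the same
framebuffer twice simply yields two tasks.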

BR,
-R

> I just added the drm_flip_work_allocate_task and
> drm_flip_work_queue_task for those who want more control on the
> queuing process.
>
> Best Regards,
>
> Boris
>
>
>
>
>
> --
> Boris Brezillon, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com


[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:50:02AM +0300, Oded Gabbay wrote:
> To support HSA on KV, we need to limit the number of vmids and pipes
> that are available for radeon's use with KV.
> 
> This patch reserves VMIDs 8-15 for KFD (so radeon can only use VMIDs
> 0-7) and also makes radeon think that KV has only a single MEC with a single
> pipe in it.
> 
> Signed-off-by: Oded Gabbay 

Reviewed-by: Jérôme Glisse

> ---
>  drivers/gpu/drm/radeon/cik.c | 48 
> ++--
>  1 file changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
> index 4bfc2c0..e0c8052 100644
> --- a/drivers/gpu/drm/radeon/cik.c
> +++ b/drivers/gpu/drm/radeon/cik.c
> @@ -4662,12 +4662,11 @@ static int cik_mec_init(struct radeon_device *rdev)
>   /*
>* KV:2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
>* CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total
> +  * Nonetheless, we assign only 1 pipe because all other pipes will
> +  * be handled by KFD
>*/
> - if (rdev->family == CHIP_KAVERI)
> - rdev->mec.num_mec = 2;
> - else
> - rdev->mec.num_mec = 1;
> - rdev->mec.num_pipe = 4;
> + rdev->mec.num_mec = 1;
> + rdev->mec.num_pipe = 1;
>   rdev->mec.num_queue = rdev->mec.num_mec * rdev->mec.num_pipe * 8;
>  
>   if (rdev->mec.hpd_eop_obj == NULL) {
> @@ -4809,28 +4808,24 @@ static int cik_cp_compute_resume(struct radeon_device 
> *rdev)
>  
>   /* init the pipes */
>   mutex_lock(&rdev->srbm_mutex);
> - for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
> - int me = (i < 4) ? 1 : 2;
> - int pipe = (i < 4) ? i : (i - 4);
>  
> - eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i * MEC_HPD_SIZE * 
> 2);
> + eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;
>  
> - cik_srbm_select(rdev, me, pipe, 0, 0);
> + cik_srbm_select(rdev, 0, 0, 0, 0);
>  
> - /* write the EOP addr */
> - WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
> - WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 
> 8);
> + /* write the EOP addr */
> + WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
> + WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8);
>  
> - /* set the VMID assigned */
> - WREG32(CP_HPD_EOP_VMID, 0);
> + /* set the VMID assigned */
> + WREG32(CP_HPD_EOP_VMID, 0);
> +
> + /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
> + tmp = RREG32(CP_HPD_EOP_CONTROL);
> + tmp &= ~EOP_SIZE_MASK;
> + tmp |= order_base_2(MEC_HPD_SIZE / 8);
> + WREG32(CP_HPD_EOP_CONTROL, tmp);
>  
> - /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
> - tmp = RREG32(CP_HPD_EOP_CONTROL);
> - tmp &= ~EOP_SIZE_MASK;
> - tmp |= order_base_2(MEC_HPD_SIZE / 8);
> - WREG32(CP_HPD_EOP_CONTROL, tmp);
> - }
> - cik_srbm_select(rdev, 0, 0, 0, 0);
>   mutex_unlock(&rdev->srbm_mutex);
>  
>   /* init the queues.  Just two for now. */
> @@ -5876,8 +5871,13 @@ int cik_ib_parse(struct radeon_device *rdev, struct 
> radeon_ib *ib)
>   */
>  int cik_vm_init(struct radeon_device *rdev)
>  {
> - /* number of VMs */
> - rdev->vm_manager.nvm = 16;
> + /*
> +  * number of VMs
> +  * VMID 0 is reserved for Graphics
> +  * radeon compute will use VMIDs 1-7
> +  * KFD will use VMIDs 8-15
> +  */
> + rdev->vm_manager.nvm = 8;
>   /* base offset of vram pages */
>   if (rdev->flags & RADEON_IS_IGP) {
>   u64 tmp = RREG32(MC_VM_FB_OFFSET);
> -- 
> 1.9.1
> 


[PATCH 0/3] drm/gk20a: support for reclocking

2014-07-11 Thread Ben Skeggs
On Fri, Jul 11, 2014 at 11:49 AM, Alexandre Courbot  
wrote:
> On 07/10/2014 06:43 PM, Peter De Schrijver wrote:
>>
>> On Thu, Jul 10, 2014 at 09:34:34AM +0200, Alexandre Courbot wrote:
>>>
>>> This series adds support for reclocking on GK20A. The first two patches
>>> touch
>>> the clock subsystem to allow GK20A to operate, by making the presence of
>>> the
>>> thermal and voltage devices optional, and allowing pstates to be provided
>>> directly instead of being probed using the BIOS (which Tegra does not
>>> have).
>>>
>>> The last patch adds the GK20A clock device. Arguably the clock can be
>>> seen as a
>>> stripped-down version of what is seen on NVE0, however instead of using
>>> NVE0
>>> support has been written from scratch using the ChromeOS kernel as a
>>> basis.
>>> There are several reasons for this:
>>>
>>> - The ChromeOS driver uses a lookup table for the P coefficient which I
>>> could
>>>not find in the NVE0 driver,
>>> - Some registers that NVE0 expects to find are not present on GK20A (e.g.
>>>0x137120 and 0x137140),
>>> - Calculation of MNP is done differently from what is performed in
>>>nva3_pll_calc(), and it might be interesting to compare the two
>>> methods,
>>> - All the same, the programming sequence is done differently in the
>>> ChromeOS
>>>driver and NVE0 could possibly benefit from it (?)
>>>
>>> It would be interesting to try and merge both, but for now I prefer to
>>> have the
>>> two coexisting to ensure proper operation on GK20A and be sure I don't
>>> break
>>> dGPU support. :)
>>>
>>> Regarding the first patch, one might argue that I could as well add
>>> thermal
>>> and voltage devices to GK20A. The reason this is not done is because
>>> these
>>> currently depend heavily on the presence of a BIOS, and will require a
>>> rework
>>> similar to that done in patch 2 for clocks. I would like to make sure
>>> this
>>> approach is approved before applying it to other subdevs.
>>
>>
>> I think this should use CCF so we can use pre and post rate change
>> notifiers
>> to hookup vdd_gpu DVS.
>
>
> Do you mean that we should turn the Nouveau gk20a clock driver into a
> consumer of this CCF clock? I have nothing against this, but note that
> Nouveau can also perform DVS on its own, as the pstates can also contain a
> voltage to be applied to the volt device (not yet implemented in this
> series).
>
> The question then becomes whether we want an additional layer of abstraction
> on these devices and whether the pre/post rate change notifiers give us any
> advantage compared to what Nouveau currently proposes.
I had a brief look at this, and personally I don't think the CCF is a
very good match at all for how we're *supposed* to manage clock
frequencies as described by a discrete GPU VBIOS, and especially for
when we get to the point of using the PMU falcon to coordinate all the
various bits and pieces that go towards power management.

>
>


[Nouveau] [PATCH v4 2/6] drm/nouveau: map pages using DMA API on platform devices

2014-07-11 Thread Alexandre Courbot
On 07/11/2014 11:50 AM, Ben Skeggs wrote:
> On Fri, Jul 11, 2014 at 12:35 PM, Alexandre Courbot  
> wrote:
>> On 07/10/2014 09:58 PM, Daniel Vetter wrote:
>>>
>>> On Tue, Jul 08, 2014 at 05:25:57PM +0900, Alexandre Courbot wrote:

>>>> page_to_phys() is not the correct way to obtain the DMA address of a
>>>> buffer on a non-PCI system. Use the DMA API functions for this, which
>>>> are portable and will allow us to use other DMA API functions for
>>>> buffer synchronization.
>>>>
>>>> Signed-off-by: Alexandre Courbot
>>>> ---
>>>>   drivers/gpu/drm/nouveau/core/engine/device/base.c | 8 +++-
>>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/base.c
>>>> b/drivers/gpu/drm/nouveau/core/engine/device/base.c
>>>> index 18c8c7245b73..e4e9e64988fe 100644
>>>> --- a/drivers/gpu/drm/nouveau/core/engine/device/base.c
>>>> +++ b/drivers/gpu/drm/nouveau/core/engine/device/base.c
>>>> @@ -489,7 +489,10 @@ nv_device_map_page(struct nouveau_device *device,
>>>> struct page *page)
>>>>  if (pci_dma_mapping_error(device->pdev, ret))
>>>>  ret = 0;
>>>>  } else {
>>>> -   ret = page_to_phys(page);
>>>> +   ret = dma_map_page(&device->platformdev->dev, page, 0,
>>>> +  PAGE_SIZE, DMA_BIDIRECTIONAL);
>>>> +   if (dma_mapping_error(&device->platformdev->dev, ret))
>>>> +   ret = 0;
>>>>  }
>>>>
>>>>  return ret;
>>>> @@ -501,6 +504,9 @@ nv_device_unmap_page(struct nouveau_device *device,
>>>> dma_addr_t addr)
>>>>  if (nv_device_is_pci(device))
>>>>  pci_unmap_page(device->pdev, addr, PAGE_SIZE,
>>>> PCI_DMA_BIDIRECTIONAL);
>>>
>>>
>>> pci_map/unmap alias to dma_unmap/map when called on the underlying struct
>>> device embedded in pci_device (like for platform drivers). Dunno whether
>>> it's worth to track a pointer to the struct device directly and always
>>> call dma_unmap/map.
>>
>>
>> Isn't it (theoretically) possible to have a platform that does not use the
>> DMA API for its PCI implementation and thus requires the pci_* functions to
>> be called? I could not find such a case in -next, which suggests that all
>> PCI platforms have been converted to the DMA API already and that we could
>> indeed refactor this to always use the DMA functions.
>>
>> But at the same time the way we use APIs should not be directed by their
>> implementation, but by their intent - and unless the PCI API has been
>> deprecated in some way (something I am not aware of), the rule is still that
>> you should use it on a PCI device.
>>
>>
>>>
>>> Just drive-by comment since I'm interested in how you solve this - i915
>>> has similar fun with buffer sharing and coherent and non-coherent
>>> platforms. Although we don't have fun with pci and non-pci based
>>> platforms.
>>
>>
>> Yeah, I am not familiar with i915 but it seems like we are in a similar boat
>> here (except ARM is more constrained as to its memory mappings). The
>> strategy in this series is, map buffers used by user-space cached and
>> explicitly synchronize them (since the ownership transition from user to GPU
>> is always clearly performed by syscalls), and use coherent mappings for
>> buffers used by the kernel which are accessed more randomly. This has solved
>> all our coherency issues and resulted in the best performance so far.
> I wonder if we might want to use unsnooped cached mappings of pages on
> non-ARM platforms also, to avoid the overhead of the cache snooping?

You might want to indeed, now that coherency is guaranteed by the sync 
functions originally introduced by Lucas. The only issue I could see is 
that they always invalidate the full buffer whereas bus snooping only 
affects pages that are actually touched. Someone would need to try this 
on a desktop machine and see how it affects performance.

I'd be all for it though, since it would also allow us to get rid of 
this ungraceful nv_device_is_cpu_coherent() function and result in 
simplifying nouveau_bo.c a bit.


[Nouveau] [PATCH v4 2/6] drm/nouveau: map pages using DMA API on platform devices

2014-07-11 Thread Lucas Stach
Am Freitag, den 11.07.2014, 11:57 +0900 schrieb Alexandre Courbot:
[...]
> >> Yeah, I am not familiar with i915 but it seems like we are in a similar 
> >> boat
> >> here (except ARM is more constrained as to its memory mappings). The
> >> strategy in this series is, map buffers used by user-space cached and
> >> explicitly synchronize them (since the ownership transition from user to 
> >> GPU
> >> is always clearly performed by syscalls), and use coherent mappings for
> >> buffers used by the kernel which are accessed more randomly. This has 
> >> solved
> >> all our coherency issues and resulted in the best performance so far.
> > I wonder if we might want to use unsnooped cached mappings of pages on
> > non-ARM platforms also, to avoid the overhead of the cache snooping?
> 
> You might want to indeed, now that coherency is guaranteed by the sync 
> functions originally introduced by Lucas. The only issue I could see is 
> that they always invalidate the full buffer whereas bus snooping only 
> affects pages that are actually touched. Someone would need to try this 
> on a desktop machine and see how it affects performance.
> 
> I'd be all for it though, since it would also allow us to get rid of 
> this ungraceful nv_device_is_cpu_coherent() function and result in 
> simplifying nouveau_bo.c a bit.

This will need some testing to get hard numbers, but I suspect that
invalidating the whole buffer isn't too bad as the prefetch machinery
works very well with the access patterns we see in graphics drivers.

Flushing out the whole buffer should be even less problematic, as it
will only flush out dirty lines that would need to be flushed on GPU
read snooping anyways.

In the long run we might want a separate cpu prepare/finish ioctl where
we can indicate the area of interest. This might help to avoid some of
the invalidate overhead especially for userspace suballocated buffers.
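
To make the "area of interest" idea concrete: given a byte range inside a
buffer, only the cache lines overlapping that range would need maintenance.
The sketch below computes that widened, clamped range in userspace C. The
64-byte line size, the sync_range struct and sync_range_for() are all
assumptions for illustration; no such ioctl or helper exists in the series
under discussion.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* assumed cache line size in bytes */

struct sync_range {
	uint64_t start;  /* inclusive, cache-line aligned */
	uint64_t end;    /* exclusive, cache-line aligned or the buffer size */
};

/*
 * Widen [offset, offset + len) to cache-line boundaries and clamp it to
 * the buffer. Returns an empty range (start == end) if nothing overlaps.
 */
static struct sync_range sync_range_for(uint64_t buf_size, uint64_t offset,
					uint64_t len)
{
	struct sync_range r = { 0, 0 };
	uint64_t end, rem;

	if (len == 0 || offset >= buf_size)
		return r;
	/* Clamp the end first so that offset + len cannot overflow. */
	end = (len > buf_size - offset) ? buf_size : offset + len;
	r.start = offset & ~(uint64_t)(CACHE_LINE - 1);
	r.end = end;
	rem = end % CACHE_LINE;
	if (rem) {
		uint64_t pad = CACHE_LINE - rem;

		/* Round up to the next line, but never past the buffer. */
		r.end = (buf_size - end < pad) ? buf_size : end + pad;
	}
	return r;
}
```

For a userspace suballocator touching a few hundred bytes of a large buffer,
such a range would cover a handful of lines instead of the whole object.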

Regards,
Lucas

-- 
Pengutronix e.K. | Lucas Stach |
Industrial Linux Solutions   | http://www.pengutronix.de/  |



drm/vmwgfx: Fix compat shader namespace

2014-07-11 Thread Dan Carpenter
On Fri, Jul 11, 2014 at 12:14:25AM +0200, Thomas Hellström wrote:
> 
> I agree, however that readability may be more important than the
> fastpath execution speed gained from this. But in my case not so
> much that I spontaneously feel like removing all branch prediction
> hints from the vmwgfx driver.

I'm just saying, let's not add new ones.

> 
> >
> >There are two rules of thumb for likely/unlikely:
> >
> > 1) Don't use it in the drivers/ directory.
> > 2) Or don't use it without benchmarking it.
> 
> Could you point me to a document or some sort of reference?

http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2012-March/025252.html
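
For readers unfamiliar with the hints under discussion: likely() and
unlikely() wrap the compiler's __builtin_expect(), which only influences code
layout and never changes the result of the condition. The small userspace
sketch below (GCC or Clang assumed; sum_hinted and sum_plain are illustrative
names) shows two functions that differ only in the annotation.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the kernel's macros; relies on a GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sum of an array; the NULL check is annotated as the cold path. */
static long sum_hinted(const int *v, size_t n)
{
	long s = 0;

	if (unlikely(v == NULL))
		return -1;
	for (size_t i = 0; i < n; i++)
		s += v[i];
	return s;
}

/* Identical logic without the hint: results must match exactly. */
static long sum_plain(const int *v, size_t n)
{
	long s = 0;

	if (v == NULL)
		return -1;
	for (size_t i = 0; i < n; i++)
		s += v[i];
	return s;
}
```

Since behaviour is identical, the only argument for the annotated version is
measured speed, which is why benchmarking is part of the rule of thumb above.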

regards,
dan carpenter



[PATCH 01/83] mm: Add kfd_process pointer to mm_struct

2014-07-11 Thread Jerome Glisse
On Fri, Jul 11, 2014 at 12:47:26AM +0300, Oded Gabbay wrote:
> This patch enables the KFD to retrieve the kfd_process
> object from the process's mm_struct. This is needed because kfd_process
> lifespan is bound to the process's mm_struct lifespan.
> 
> When KFD is notified about an mm_struct tear-down, it checks if the
> kfd_process pointer is valid. If so, it releases the kfd_process object
> and all relevant resources.
> 
> Signed-off-by: Oded Gabbay 
> ---
>  include/linux/mm_types.h | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 678097c..6179107 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -20,6 +20,10 @@
>  struct hmm;
>  #endif
>  
> +#ifdef CONFIG_HSA_RADEON
> +struct kfd_process;
> +#endif
> +
>  #ifndef AT_VECTOR_SIZE_ARCH
>  #define AT_VECTOR_SIZE_ARCH 0
>  #endif
> @@ -439,6 +443,16 @@ struct mm_struct {
>*/
>   struct hmm *hmm;
>  #endif
> +#if defined(CONFIG_HSA_RADEON) || defined(CONFIG_HSA_RADEON_MODULE)
> + /*
> +  * kfd always registers an mmu_notifier; we rely on the mmu notifier to
> +  * keep a refcount on the mm struct as well as to forbid registering kfd
> +  * on a dying mm.
> +  *
> +  * This field is set with mmap_sem held in write mode.
> +  */
> + struct kfd_process *kfd_process;
> +#endif

I understand the need to bind kfd to the mm lifetime, but this is wrong
on several levels. First, we do not want a per-driver define flag here.
Second, this should be an IOMMU/PASID pointer of some sort. I am sure
that Intel will want to add itself to mm_struct too, so instead of
having each IOMMU add a pointer here, I would rather see a generic
pointer to a generic IOMMU struct, and have this use generic IOMMU
code that can then call a specific user dispatch function.

I know this adds a layer, but this is not a critical code path and
should never be.

I am adding Jesse as he might have thoughts on that.

So this one is NAK

Cheers,
Jérôme

>  #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
>   pgtable_t pmd_huge_pte; /* protected by page_table_lock */
>  #endif
> -- 
> 1.9.1
> 
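
The generic-pointer idea raised in the review can be sketched roughly as follows. This is only an illustration in userspace C: every name in it (iommu_user_ops, iommu_mm_state, mm_sketch, mm_sketch_teardown, example_release) is hypothetical and does not correspond to an existing kernel interface. It shows one generic pointer in the mm stand-in, with teardown dispatched through a callback table instead of a per-driver #ifdef.

```c
#include <stddef.h>

/* Hypothetical per-user callbacks; not an existing kernel interface. */
struct iommu_user_ops {
	void (*release)(void *priv);	/* called when the mm goes away */
};

/* Hypothetical generic state that the mm would point to. */
struct iommu_mm_state {
	const struct iommu_user_ops *ops;
	void *priv;			/* e.g. a driver's per-process object */
};

/* Stand-in for mm_struct: one generic pointer, no per-driver #ifdef. */
struct mm_sketch {
	struct iommu_mm_state *iommu;
};

/* Generic teardown: dispatch to whichever user registered itself. */
void mm_sketch_teardown(struct mm_sketch *mm)
{
	struct iommu_mm_state *st = mm->iommu;

	if (st && st->ops && st->ops->release)
		st->ops->release(st->priv);
	mm->iommu = NULL;
}

/* Example user callback: counts how often it was released. */
void example_release(void *priv)
{
	++*(int *)priv;
}
```

The extra indirection costs one pointer chase at teardown, which matches the remark that this is not a critical code path.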


[RFC PATCH] drm: rework flip-work helpers to avoid calling func when the FIFO is full

2014-07-11 Thread Rob Clark
On Fri, Jul 11, 2014 at 11:17 AM, Boris BREZILLON
 wrote:
> Make use of lists instead of kfifo in order to dynamically allocate
> a task entry when someone requires some delayed work, thus preventing
> drm_flip_work_queue from directly calling func instead of queuing this
> call.
> This allows drm_flip_work_queue to be safely called even within irq
> handlers.
>
> Add new helper functions to allocate a flip work task and queue it when
> needed. This prevents allocating data within irq context (which might
> impact the time spent in the irq handler).
>
> Signed-off-by: Boris BREZILLON 
> ---
> Hi Rob,
>
> This is a proposal for what you suggested (dynamically growing the drm
> flip work queue in order to avoid direct call of work->func when calling
> drm_flip_work_queue).
>
> I'm not sure this is exactly what you expected, because I'm now using
> lists instead of kfifo (and thus lose the lockless part), but at least
> we can now safely call drm_flip_work_queue or drm_flip_work_queue_task
> from irq handlers :-).
>
> You were also worried about queueing the same framebuffer multiple times
> and with this implementation you shouldn't have any problem (at least with
> drm_flip_work_queue, what people do with drm_flip_work_queue_task is their
> own responsibility, but they should allocate one task for each operation
> even if they are manipulating the same framebuffer).

yeah, if we are dynamically allocating the list nodes, that solves the
queuing-up-multiple-times issue..

I wonder if drm_flip_work_allocate_task() should use GFP_ATOMIC when
allocating?  I guess maybe it is possible to pre-allocate the task
from non-irq context, and then queue it from irq context.. it makes
the API a bit more complex, but there are only a couple users
currently, so I suppose this should be doable.

BR,
-R

> This is just a suggestion, so don't hesitate to tell me that it doesn't
> match your expectations.
>
> Best Regards,
>
> Boris
>
>  drivers/gpu/drm/drm_flip_work.c | 95 ++---
>  include/drm/drm_flip_work.h | 29 +
>  2 files changed, 92 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_flip_work.c b/drivers/gpu/drm/drm_flip_work.c
> index f9c7fa3..21d5715 100644
> --- a/drivers/gpu/drm/drm_flip_work.c
> +++ b/drivers/gpu/drm/drm_flip_work.c
> @@ -25,6 +25,43 @@
>  #include "drm_flip_work.h"
>
>  /**
> + * drm_flip_work_allocate_task - allocate a flip-work task
> + * @data: data associated to the task
> + *
> + * Allocate a drm_flip_task object and attach private data to it.
> + */
> +struct drm_flip_task *drm_flip_work_allocate_task(void *data)
> +{
> +   struct drm_flip_task *task;
> +
> +   task = kzalloc(sizeof(*task), GFP_KERNEL);
> +   if (task)
> +   task->data = data;
> +
> +   return task;
> +}
> +EXPORT_SYMBOL(drm_flip_work_allocate_task);
> +
> +/**
> + * drm_flip_work_queue_task - queue a specific task
> + * @work: the flip-work
> + * @task: the task to handle
> + *
> + * Queues task, that will later be run (passed back to drm_flip_func_t
> + * func) on a work queue after drm_flip_work_commit() is called.
> + */
> +void drm_flip_work_queue_task(struct drm_flip_work *work,
> + struct drm_flip_task *task)
> +{
> +   unsigned long flags;
> +
> +   spin_lock_irqsave(&work->lock, flags);
> +   list_add_tail(&task->node, &work->queued);
> +   spin_unlock_irqrestore(&work->lock, flags);
> +}
> +EXPORT_SYMBOL(drm_flip_work_queue_task);
> +
> +/**
>   * drm_flip_work_queue - queue work
>   * @work: the flip-work
>   * @val: the value to queue
> @@ -34,10 +71,14 @@
>   */
>  void drm_flip_work_queue(struct drm_flip_work *work, void *val)
>  {
> -   if (kfifo_put(&work->fifo, val)) {
> -   atomic_inc(&work->pending);
> +   struct drm_flip_task *task;
> +
> +   task = kzalloc(sizeof(*task), GFP_KERNEL);
> +   if (task) {
> +   task->data = val;
> +   drm_flip_work_queue_task(work, task);
> } else {
> -   DRM_ERROR("%s fifo full!\n", work->name);
> +   DRM_ERROR("%s could not allocate task!\n", work->name);
> work->func(work, val);
> }
>  }
> @@ -56,9 +97,12 @@ EXPORT_SYMBOL(drm_flip_work_queue);
>  void drm_flip_work_commit(struct drm_flip_work *work,
> struct workqueue_struct *wq)
>  {
> -   uint32_t pending = atomic_read(&work->pending);
> -   atomic_add(pending, &work->count);
> -   atomic_sub(pending, &work->pending);
> +   unsigned long flags;
> +
> +   spin_lock_irqsave(&work->lock, flags);
> +   list_splice_tail(&work->queued, &work->commited);
> +   INIT_LIST_HEAD(&work->queued);
> +   spin_unlock_irqrestore(&work->lock, flags);
> queue_work(wq, &work->worker);
>  }
>  EXPORT_SYMBOL(drm_flip_work_commit);
> @@ -66,14 +110,26 @@ EXPORT_SYMBOL(drm_flip_work_commit);
>  static void flip_worker(struct work_struct *w)
>  {
> struct drm_flip_work *work = container_of(w, struct drm_flip_work, 
> 
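
The split discussed in this thread (allocate the task where sleeping is allowed, queue it where it is not) can be sketched in userspace C. This is a simplified model, not the proposed kernel code: the names flip_task_alloc(), flip_queue_add(), flip_queue_run() and count_flip() are hypothetical, calloc() stands in for kzalloc(), and the spinlock the kernel version takes around the list is omitted.

```c
#include <stdlib.h>

struct flip_task {
	struct flip_task *next;
	void *data;
};

struct flip_queue {
	struct flip_task *head;
	struct flip_task **tail;
};

void flip_queue_init(struct flip_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

/* Allocation step: may fail, so do it ahead of time in process context. */
struct flip_task *flip_task_alloc(void *data)
{
	struct flip_task *t = calloc(1, sizeof(*t));

	if (t)
		t->data = data;
	return t;
}

/* Queueing step: no allocation, only pointer updates. */
void flip_queue_add(struct flip_queue *q, struct flip_task *t)
{
	t->next = NULL;
	*q->tail = t;
	q->tail = &t->next;
}

/* Worker side: run func on every queued task in order, then free it. */
size_t flip_queue_run(struct flip_queue *q, void (*func)(void *data))
{
	struct flip_task *t = q->head;
	size_t n = 0;

	flip_queue_init(q);
	while (t) {
		struct flip_task *next = t->next;

		func(t->data);
		free(t);
		t = next;
		n++;
	}
	return n;
}

/* Example callback: counts invocations through the data pointer. */
void count_flip(void *data)
{
	++*(int *)data;
}
```

Because each queued operation owns its own node, queueing the same data pointer twice simply results in two callbacks, which is the multiple-queueing case mentioned above.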

[Nouveau] [PATCH v4 4/6] drm/nouveau: synchronize BOs when required

2014-07-11 Thread Alexandre Courbot
On 07/10/2014 10:04 PM, Daniel Vetter wrote:
> On Tue, Jul 08, 2014 at 05:25:59PM +0900, Alexandre Courbot wrote:
>> On architectures for which access to GPU memory is non-coherent,
>> caches need to be flushed and invalidated explicitly when BO control
>> changes between CPU and GPU.
>>
>> This patch adds buffer synchronization functions which invoke the
>> correct API (PCI or DMA) to ensure synchronization is effective.
>>
>> Based on the TTM DMA cache helper patches by Lucas Stach.
>>
>> Signed-off-by: Lucas Stach 
>> Signed-off-by: Alexandre Courbot 
>> ---
>>   drivers/gpu/drm/nouveau/nouveau_bo.c  | 56 
>> +++
>>   drivers/gpu/drm/nouveau/nouveau_bo.h  |  2 ++
>>   drivers/gpu/drm/nouveau/nouveau_gem.c | 12 
>>   3 files changed, 70 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
>> b/drivers/gpu/drm/nouveau/nouveau_bo.c
>> index 67e9e8e2e2ec..47e4e8886769 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
>> @@ -402,6 +402,60 @@ nouveau_bo_unmap(struct nouveau_bo *nvbo)
>>  ttm_bo_kunmap(&nvbo->kmap);
>>   }
>>
>> +void
>> +nouveau_bo_sync_for_device(struct nouveau_bo *nvbo)
>> +{
>> +struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
>> +struct nouveau_device *device = nouveau_dev(drm->dev);
>> +struct ttm_dma_tt *ttm_dma = (struct ttm_dma_tt *)nvbo->bo.ttm;
>> +int i;
>> +
>> +if (!ttm_dma)
>> +return;
>> +
>> +if (nv_device_is_cpu_coherent(device) || nvbo->force_coherent)
>> +return;
>
> Is the is_cpu_coherent check really required? On coherent platforms the
> sync_for_foo should be a noop. It's the dma api's job to encapsulate this
> knowledge so that drivers can be blissfully ignorant. The explicit
> is_coherent check makes this a bit leaky. And the same comment applies:
> underneath, the bus-specific dma-mapping functions are identical.

I think you are right, the is_cpu_coherent check should not be needed 
here. I still think we should have separate paths for the PCI/DMA cases 
though, unless you can point me to a source that clearly states that the 
PCI API is deprecated and that DMA should be used instead.


[Nouveau] [PATCH v4 2/6] drm/nouveau: map pages using DMA API on platform devices

2014-07-11 Thread Alexandre Courbot
On 07/10/2014 09:58 PM, Daniel Vetter wrote:
> On Tue, Jul 08, 2014 at 05:25:57PM +0900, Alexandre Courbot wrote:
>> page_to_phys() is not the correct way to obtain the DMA address of a
>> buffer on a non-PCI system. Use the DMA API functions for this, which
>> are portable and will allow us to use other DMA API functions for
>> buffer synchronization.
>>
>> Signed-off-by: Alexandre Courbot 
>> ---
>>   drivers/gpu/drm/nouveau/core/engine/device/base.c | 8 +++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/core/engine/device/base.c 
>> b/drivers/gpu/drm/nouveau/core/engine/device/base.c
>> index 18c8c7245b73..e4e9e64988fe 100644
>> --- a/drivers/gpu/drm/nouveau/core/engine/device/base.c
>> +++ b/drivers/gpu/drm/nouveau/core/engine/device/base.c
>> @@ -489,7 +489,10 @@ nv_device_map_page(struct nouveau_device *device, 
>> struct page *page)
>>  if (pci_dma_mapping_error(device->pdev, ret))
>>  ret = 0;
>>  } else {
>> -ret = page_to_phys(page);
>> +ret = dma_map_page(&device->platformdev->dev, page, 0,
>> +   PAGE_SIZE, DMA_BIDIRECTIONAL);
>> +if (dma_mapping_error(&device->platformdev->dev, ret))
>> +ret = 0;
>>  }
>>
>>  return ret;
>> @@ -501,6 +504,9 @@ nv_device_unmap_page(struct nouveau_device *device, 
>> dma_addr_t addr)
>>  if (nv_device_is_pci(device))
>>  pci_unmap_page(device->pdev, addr, PAGE_SIZE,
>> PCI_DMA_BIDIRECTIONAL);
>
> pci_map/unmap alias to dma_unmap/map when called on the underlying struct
> device embedded in pci_device (like for platform drivers). Dunno whether
> it's worth to track a pointer to the struct device directly and always
> call dma_unmap/map.

Isn't it (theoretically) possible to have a platform that does not use 
the DMA API for its PCI implementation and thus requires the pci_* 
functions to be called? I could not find such a case in -next, which 
suggests that all PCI platforms have been converted to the DMA API 
already and that we could indeed refactor this to always use the DMA 
functions.

But at the same time the way we use APIs should not be directed by their 
implementation, but by their intent - and unless the PCI API has been 
deprecated in some way (something I am not aware of), the rule is still 
that you should use it on a PCI device.

>
> Just drive-by comment since I'm interested in how you solve this - i915
> has similar fun with buffer sharing and coherent and non-coherent
> platforms. Although we don't have fun with pci and non-pci based
> platforms.

Yeah, I am not familiar with i915 but it seems like we are in a similar 
boat here (except that ARM is more constrained as to its memory mappings). 
The strategy in this series is, map buffers used by user-space cached 
and explicitly synchronize them (since the ownership transition from 
user to GPU is always clearly performed by syscalls), and use coherent 
mappings for buffers used by the kernel which are accessed more 
randomly. This has solved all our coherency issues and resulted in the 
best performance so far.
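
The strategy described above (cached mappings that are synchronized explicitly at each ownership transition, coherent mappings otherwise) can be modelled with a small state machine. This is a userspace sketch under stated assumptions: struct bo_sketch and both functions are hypothetical, and the two counters merely stand in for the cache flush and invalidate operations a real driver would issue through the DMA API.

```c
enum bo_owner { BO_OWNER_CPU, BO_OWNER_DEVICE };

struct bo_sketch {
	enum bo_owner owner;
	int coherent;			/* nonzero: no cache maintenance needed */
	unsigned int flushes;		/* stand-in for CPU -> device flushes */
	unsigned int invalidates;	/* stand-in for device -> CPU invalidates */
};

/* Hand the buffer to the device; flush only on a real transition. */
void bo_sync_for_device(struct bo_sketch *bo)
{
	if (bo->owner == BO_OWNER_DEVICE)
		return;
	if (!bo->coherent)
		bo->flushes++;
	bo->owner = BO_OWNER_DEVICE;
}

/* Hand the buffer back to the CPU; invalidate only on a real transition. */
void bo_sync_for_cpu(struct bo_sketch *bo)
{
	if (bo->owner == BO_OWNER_CPU)
		return;
	if (!bo->coherent)
		bo->invalidates++;
	bo->owner = BO_OWNER_CPU;
}
```

In this model a cached user buffer pays for maintenance once per transition, while a coherent kernel buffer never does, which is the trade-off the message describes.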



[PATCH 8/8] drm/tilcdc: panel: Add support for enable GPIO

2014-07-11 Thread Ezequiel Garcia
In order to support the "enable GPIO" available in many panel devices,
this commit adds a proper devicetree binding.

By providing an enable GPIO in the devicetree, the driver can now turn
off and on the panel device, and/or the backlight device. Both the
backlight and the GPIO are optional properties.

Signed-off-by: Ezequiel Garcia 
---
 .../devicetree/bindings/drm/tilcdc/panel.txt   |  2 ++
 drivers/gpu/drm/tilcdc/tilcdc_panel.c  | 37 +++---
 2 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/Documentation/devicetree/bindings/drm/tilcdc/panel.txt 
b/Documentation/devicetree/bindings/drm/tilcdc/panel.txt
index 10a06e8..4ab9e23 100644
--- a/Documentation/devicetree/bindings/drm/tilcdc/panel.txt
+++ b/Documentation/devicetree/bindings/drm/tilcdc/panel.txt
@@ -20,6 +20,7 @@ Required properties:

 Optional properties:
 - backlight: phandle of the backlight device attached to the panel
+- enable-gpios: GPIO pin to enable or disable the panel

 Recommended properties:
  - pinctrl-names, pinctrl-0: the pincontrol settings to configure
@@ -33,6 +34,7 @@ Example:
pinctrl-names = "default";
pinctrl-0 = <_lcd3_cape_lcd_pins>;
backlight = <>;
+   enable-gpios = < 19 0>;

panel-info {
ac-bias   = <255>;
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_panel.c 
b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
index f2a5b23..7a03158 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_panel.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -29,6 +30,7 @@ struct panel_module {
struct tilcdc_panel_info *info;
struct display_timings *timings;
struct backlight_device *backlight;
+   struct gpio_desc *enable_gpio;
 };
 #define to_panel_module(x) container_of(x, struct panel_module, base)

@@ -55,13 +57,17 @@ static void panel_encoder_dpms(struct drm_encoder *encoder, 
int mode)
 {
struct panel_encoder *panel_encoder = to_panel_encoder(encoder);
struct backlight_device *backlight = panel_encoder->mod->backlight;
+   struct gpio_desc *gpio = panel_encoder->mod->enable_gpio;

-   if (!backlight)
-   return;
+   if (backlight) {
+   backlight->props.power = mode == DRM_MODE_DPMS_ON ?
+FB_BLANK_UNBLANK : FB_BLANK_POWERDOWN;
+   backlight_update_status(backlight);
+   }

-   backlight->props.power = mode == DRM_MODE_DPMS_ON
-? FB_BLANK_UNBLANK : FB_BLANK_POWERDOWN;
-   backlight_update_status(backlight);
+   if (gpio)
+   gpiod_set_value_cansleep(gpio,
+mode == DRM_MODE_DPMS_ON ? 1 : 0);
 }

 static bool panel_encoder_mode_fixup(struct drm_encoder *encoder,
@@ -369,6 +375,25 @@ static int panel_probe(struct platform_device *pdev)
dev_info(&pdev->dev, "found backlight\n");
}

+   panel_mod->enable_gpio = devm_gpiod_get(&pdev->dev, "enable");
+   if (IS_ERR(panel_mod->enable_gpio)) {
+   ret = PTR_ERR(panel_mod->enable_gpio);
+   if (ret != -ENOENT) {
+   dev_err(&pdev->dev, "failed to request enable GPIO\n");
+   goto fail_backlight;
+   }
+
+   /* Optional GPIO is not here, continue silently. */
+   panel_mod->enable_gpio = NULL;
+   } else {
+   ret = gpiod_direction_output(panel_mod->enable_gpio, 0);
+   if (ret < 0) {
+   dev_err(&pdev->dev, "failed to setup GPIO\n");
+   goto fail_backlight;
+   }
+   dev_info(&pdev->dev, "found enable GPIO\n");
+   }
+
mod = &panel_mod->base;
pdev->dev.platform_data = mod;

@@ -401,6 +426,8 @@ fail_timings:

 fail_free:
tilcdc_module_cleanup(mod);
+
+fail_backlight:
if (panel_mod->backlight)
put_device(&panel_mod->backlight->dev);
return ret;
-- 
1.9.1
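
The optional-resource handling in the probe hunk above follows a common pattern: "not described" is tolerated, every other lookup error is fatal. A minimal userspace sketch of just that decision, with a hypothetical helper name resolve_optional() and plain errno values standing in for the PTR_ERR() result:

```c
#include <errno.h>

/*
 * lookup_err: 0 on success, negative errno on failure.
 * *present is set to 1 (found) or 0 (absent) when 0 is returned;
 * it is left untouched when a real error is propagated.
 */
int resolve_optional(int lookup_err, int *present)
{
	if (lookup_err == 0) {
		*present = 1;
		return 0;
	}
	if (lookup_err == -ENOENT) {
		*present = 0;	/* optional resource not described: not an error */
		return 0;
	}
	return lookup_err;	/* anything else is a real failure */
}
```

A caller would continue probing when this returns 0 and skip the GPIO writes whenever *present is 0, mirroring the NULL enable_gpio check in panel_encoder_dpms().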



[PATCH 7/8] drm/tilcdc: panel: Set return value explicitly

2014-07-11 Thread Ezequiel Garcia
Instead of setting an initial value for the return code, set it explicitly
on each error path. This is just a cosmetic cleanup, as preparation for the
enable GPIO support.

Signed-off-by: Ezequiel Garcia 
---
 drivers/gpu/drm/tilcdc/tilcdc_panel.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_panel.c 
b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
index 3dcf08e..f2a5b23 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_panel.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
@@ -346,7 +346,7 @@ static int panel_probe(struct platform_device *pdev)
struct panel_module *panel_mod;
struct tilcdc_module *mod;
struct pinctrl *pinctrl;
-   int ret = -EINVAL;
+   int ret;

/* bail out early if no DT data: */
if (!node) {
@@ -381,12 +381,14 @@ static int panel_probe(struct platform_device *pdev)
panel_mod->timings = of_get_display_timings(node);
if (!panel_mod->timings) {
dev_err(&pdev->dev, "could not get panel timings\n");
+   ret = -EINVAL;
goto fail_free;
}

panel_mod->info = of_get_panel_info(node);
if (!panel_mod->info) {
dev_err(&pdev->dev, "could not get panel info\n");
+   ret = -EINVAL;
goto fail_timings;
}

-- 
1.9.1



[PATCH 6/8] drm/tilcdc: panel: Fix backlight devicetree support

2014-07-11 Thread Ezequiel Garcia
The current backlight support is broken; the driver expects a backlight-class
in the panel devicetree node. Fix this by implementing it properly, getting
an optional backlight from a phandle.

This shouldn't cause any backward-compatibility DT issue because the current
implementation doesn't work and is not even documented.

Signed-off-by: Ezequiel Garcia 
---
 .../devicetree/bindings/drm/tilcdc/panel.txt   |  5 +
 drivers/gpu/drm/tilcdc/tilcdc_panel.c  | 23 +-
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/Documentation/devicetree/bindings/drm/tilcdc/panel.txt 
b/Documentation/devicetree/bindings/drm/tilcdc/panel.txt
index 9301c33..10a06e8 100644
--- a/Documentation/devicetree/bindings/drm/tilcdc/panel.txt
+++ b/Documentation/devicetree/bindings/drm/tilcdc/panel.txt
@@ -18,6 +18,9 @@ Required properties:
Documentation/devicetree/bindings/video/display-timing.txt for display
timing binding details.

+Optional properties:
+- backlight: phandle of the backlight device attached to the panel
+
 Recommended properties:
  - pinctrl-names, pinctrl-0: the pincontrol settings to configure
muxing properly for pins that connect to TFP410 device
@@ -29,6 +32,8 @@ Example:
compatible = "ti,tilcdc,panel";
pinctrl-names = "default";
pinctrl-0 = <_lcd3_cape_lcd_pins>;
+   backlight = <>;
+
panel-info {
ac-bias   = <255>;
ac-bias-intrpt= <0>;
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_panel.c 
b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
index c716c12..3dcf08e 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_panel.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
@@ -342,7 +342,7 @@ static struct tilcdc_panel_info *of_get_panel_info(struct 
device_node *np)

 static int panel_probe(struct platform_device *pdev)
 {
-   struct device_node *node = pdev->dev.of_node;
+   struct device_node *bl_node, *node = pdev->dev.of_node;
struct panel_module *panel_mod;
struct tilcdc_module *mod;
struct pinctrl *pinctrl;
@@ -358,6 +358,17 @@ static int panel_probe(struct platform_device *pdev)
if (!panel_mod)
return -ENOMEM;

+   bl_node = of_parse_phandle(node, "backlight", 0);
+   if (bl_node) {
+   panel_mod->backlight = of_find_backlight_by_node(bl_node);
+   of_node_put(bl_node);
+
+   if (!panel_mod->backlight)
+   return -EPROBE_DEFER;
+
+   dev_info(&pdev->dev, "found backlight\n");
+   }
+
mod = &panel_mod->base;
pdev->dev.platform_data = mod;

@@ -381,10 +392,6 @@ static int panel_probe(struct platform_device *pdev)

mod->preferred_bpp = panel_mod->info->bpp;

-   panel_mod->backlight = of_find_backlight_by_node(node);
-   if (panel_mod->backlight)
-   dev_info(&pdev->dev, "found backlight\n");
-
return 0;

 fail_timings:
@@ -392,6 +399,8 @@ fail_timings:

 fail_free:
tilcdc_module_cleanup(mod);
+   if (panel_mod->backlight)
+   put_device(&panel_mod->backlight->dev);
return ret;
 }

@@ -399,6 +408,10 @@ static int panel_remove(struct platform_device *pdev)
 {
struct tilcdc_module *mod = dev_get_platdata(&pdev->dev);
struct panel_module *panel_mod = to_panel_module(mod);
+   struct backlight_device *backlight = panel_mod->backlight;
+
+   if (backlight)
+   put_device(&backlight->dev);

display_timings_release(panel_mod->timings);

-- 
1.9.1



[PATCH 5/8] drm/tilcdc: panel: Use devm_kzalloc to simplify the error path

2014-07-11 Thread Ezequiel Garcia
Using the managed variant to allocate the resource makes the code simpler
and less error-prone.

Signed-off-by: Ezequiel Garcia 
---
 drivers/gpu/drm/tilcdc/tilcdc_panel.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_panel.c 
b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
index 4b36e68..c716c12 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_panel.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
@@ -354,7 +354,7 @@ static int panel_probe(struct platform_device *pdev)
return -ENXIO;
}

-   panel_mod = kzalloc(sizeof(*panel_mod), GFP_KERNEL);
+   panel_mod = devm_kzalloc(&pdev->dev, sizeof(*panel_mod), GFP_KERNEL);
if (!panel_mod)
return -ENOMEM;

@@ -391,7 +391,6 @@ fail_timings:
display_timings_release(panel_mod->timings);

 fail_free:
-   kfree(panel_mod);
tilcdc_module_cleanup(mod);
return ret;
 }
@@ -405,7 +404,6 @@ static int panel_remove(struct platform_device *pdev)

tilcdc_module_cleanup(mod);
kfree(panel_mod->info);
-   kfree(panel_mod);

return 0;
 }
-- 
1.9.1



[PATCH 4/8] drm/tilcdc: panel: Spurious whitespace removal

2014-07-11 Thread Ezequiel Garcia
Just a cosmetic cleanup.

Signed-off-by: Ezequiel Garcia 
---
 drivers/gpu/drm/tilcdc/tilcdc_panel.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_panel.c 
b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
index 8f88bfd..4b36e68 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_panel.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
@@ -348,7 +348,6 @@ static int panel_probe(struct platform_device *pdev)
struct pinctrl *pinctrl;
int ret = -EINVAL;

-
/* bail out early if no DT data: */
if (!node) {
dev_err(&pdev->dev, "device-tree data is missing\n");
-- 
1.9.1



[PATCH 3/8] drm/tilcdc: panel: Remove unused variable

2014-07-11 Thread Ezequiel Garcia
Just a trivial cleanup to remove the variable.

Signed-off-by: Ezequiel Garcia 
---
 drivers/gpu/drm/tilcdc/tilcdc_panel.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_panel.c 
b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
index d581c53..8f88bfd 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_panel.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
@@ -340,8 +340,6 @@ static struct tilcdc_panel_info *of_get_panel_info(struct 
device_node *np)
return info;
 }

-static struct of_device_id panel_of_match[];
-
 static int panel_probe(struct platform_device *pdev)
 {
struct device_node *node = pdev->dev.of_node;
-- 
1.9.1



[PATCH 2/8] drm/tilcdc: panel: Add missing of_node_put

2014-07-11 Thread Ezequiel Garcia
This commit adds the missing calls to of_node_put to release the node
that's currently held by the of_get_child_by_name() call in the panel
info parsing code.

Signed-off-by: Ezequiel Garcia 
---
 drivers/gpu/drm/tilcdc/tilcdc_panel.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_panel.c 
b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
index 4c7aa1d..d581c53 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_panel.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
@@ -311,6 +311,7 @@ static struct tilcdc_panel_info *of_get_panel_info(struct 
device_node *np)
info = kzalloc(sizeof(*info), GFP_KERNEL);
if (!info) {
pr_err("%s: allocation failed\n", __func__);
+   of_node_put(info_np);
return NULL;
}

@@ -331,8 +332,10 @@ static struct tilcdc_panel_info *of_get_panel_info(struct 
device_node *np)
if (ret) {
pr_err("%s: error reading panel-info properties\n", __func__);
kfree(info);
+   of_node_put(info_np);
return NULL;
}
+   of_node_put(info_np);

return info;
 }
-- 
1.9.1
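
The fix above comes down to balancing one "get" with exactly one "put" on every exit path. A minimal userspace sketch of that rule, using a hypothetical refcounted struct node_sketch in place of struct device_node and node_get()/node_put() in place of the OF helpers:

```c
#include <stddef.h>

struct node_sketch {
	int refcount;
};

/* Take a reference; returns the node for convenience (NULL stays NULL). */
struct node_sketch *node_get(struct node_sketch *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Drop a reference taken with node_get(). */
void node_put(struct node_sketch *n)
{
	if (n)
		n->refcount--;
}

/* Parse step that takes a reference and drops it on every exit path. */
int parse_info(struct node_sketch *child, int props_ok)
{
	struct node_sketch *np = node_get(child);

	if (!np)
		return -1;	/* nothing was taken, nothing to put */

	if (!props_ok) {
		node_put(np);	/* error path */
		return -1;
	}

	node_put(np);		/* success path */
	return 0;
}
```

After any call to parse_info() the reference count is back at its starting value, whichever path was taken.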



[PATCH 1/8] drm/tilcdc: Fix the error path in tilcdc_load()

2014-07-11 Thread Ezequiel Garcia
The current error path calls tilcdc_unload() in case of an error to release
the resources. However, this is wrong because not all resources have been
allocated by the time an error occurs in tilcdc_load().

To fix it, this commit adds proper labels to bail out at the different
stages in the load function, and release only the resources actually allocated.

Signed-off-by: Ezequiel Garcia 
---
 drivers/gpu/drm/tilcdc/tilcdc_drv.c | 60 ++---
 1 file changed, 50 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_drv.c 
b/drivers/gpu/drm/tilcdc/tilcdc_drv.c
index 6be623b..000428e 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_drv.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_drv.c
@@ -84,6 +84,7 @@ static int modeset_init(struct drm_device *dev)
if ((priv->num_encoders == 0) || (priv->num_connectors == 0)) {
/* oh nos! */
dev_err(dev->dev, "no encoders/connectors found\n");
+   drm_mode_config_cleanup(dev);
return -ENXIO;
}

@@ -172,33 +173,37 @@ static int tilcdc_load(struct drm_device *dev, unsigned 
long flags)
dev->dev_private = priv;

priv->wq = alloc_ordered_workqueue("tilcdc", 0);
+   if (!priv->wq) {
+   ret = -ENOMEM;
+   goto fail_free_priv;
+   }

res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res) {
dev_err(dev->dev, "failed to get memory resource\n");
ret = -EINVAL;
-   goto fail;
+   goto fail_free_wq;
}

priv->mmio = ioremap_nocache(res->start, resource_size(res));
if (!priv->mmio) {
dev_err(dev->dev, "failed to ioremap\n");
ret = -ENOMEM;
-   goto fail;
+   goto fail_free_wq;
}

priv->clk = clk_get(dev->dev, "fck");
if (IS_ERR(priv->clk)) {
dev_err(dev->dev, "failed to get functional clock\n");
ret = -ENODEV;
-   goto fail;
+   goto fail_iounmap;
}

priv->disp_clk = clk_get(dev->dev, "dpll_disp_ck");
if (IS_ERR(priv->clk)) {
dev_err(dev->dev, "failed to get display clock\n");
ret = -ENODEV;
-   goto fail;
+   goto fail_put_clk;
}

 #ifdef CONFIG_CPU_FREQ
@@ -208,7 +213,7 @@ static int tilcdc_load(struct drm_device *dev, unsigned 
long flags)
CPUFREQ_TRANSITION_NOTIFIER);
if (ret) {
dev_err(dev->dev, "failed to register cpufreq notifier\n");
-   goto fail;
+   goto fail_put_disp_clk;
}
 #endif

@@ -253,13 +258,13 @@ static int tilcdc_load(struct drm_device *dev, unsigned 
long flags)
ret = modeset_init(dev);
if (ret < 0) {
dev_err(dev->dev, "failed to initialize mode setting\n");
-   goto fail;
+   goto fail_cpufreq_unregister;
}

ret = drm_vblank_init(dev, 1);
if (ret < 0) {
dev_err(dev->dev, "failed to initialize vblank\n");
-   goto fail;
+   goto fail_mode_config_cleanup;
}

pm_runtime_get_sync(dev->dev);
@@ -267,7 +272,7 @@ static int tilcdc_load(struct drm_device *dev, unsigned 
long flags)
pm_runtime_put_sync(dev->dev);
if (ret < 0) {
dev_err(dev->dev, "failed to install IRQ handler\n");
-   goto fail;
+   goto fail_vblank_cleanup;
}

platform_set_drvdata(pdev, dev);
@@ -283,13 +288,48 @@ static int tilcdc_load(struct drm_device *dev, unsigned 
long flags)
priv->fbdev = drm_fbdev_cma_init(dev, bpp,
dev->mode_config.num_crtc,
dev->mode_config.num_connector);
+   if (IS_ERR(priv->fbdev)) {
+   ret = PTR_ERR(priv->fbdev);
+   goto fail_irq_uninstall;
+   }

drm_kms_helper_poll_init(dev);

return 0;

-fail:
-   tilcdc_unload(dev);
+fail_irq_uninstall:
+   pm_runtime_get_sync(dev->dev);
+   drm_irq_uninstall(dev);
+   pm_runtime_put_sync(dev->dev);
+
+fail_vblank_cleanup:
+   drm_vblank_cleanup(dev);
+
+fail_mode_config_cleanup:
+   drm_mode_config_cleanup(dev);
+
+fail_cpufreq_unregister:
+   pm_runtime_disable(dev->dev);
+#ifdef CONFIG_CPU_FREQ
+   cpufreq_unregister_notifier(&priv->freq_transition,
+   CPUFREQ_TRANSITION_NOTIFIER);
+fail_put_disp_clk:
+   clk_put(priv->disp_clk);
+#endif
+
+fail_put_clk:
+   clk_put(priv->clk);
+
+fail_iounmap:
+   iounmap(priv->mmio);
+
+fail_free_wq:
+   flush_workqueue(priv->wq);
+   destroy_workqueue(priv->wq);
+
+fail_free_priv:
+   dev->dev_private = NULL;
+   kfree(priv);
return ret;
 }

-- 
1.9.1
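
The label ladder introduced by this patch can be reduced to a small userspace sketch. The names here (struct load_state, acquire(), sketch_load()) are hypothetical, and the fail_stage parameter exists only to simulate a failure at a chosen step. Each failure jumps to the label that releases exactly what was acquired before it, in reverse order.

```c
#include <errno.h>

struct load_state {
	int wq;		/* 1 while the "workqueue" is held */
	int mmio;	/* 1 while the "mapping" is held */
	int clk;	/* 1 while the "clock" is held */
};

/* Simulated acquisition: fails with -ENOMEM at the requested stage. */
static int acquire(int *res, int stage, int fail_stage)
{
	if (stage == fail_stage)
		return -ENOMEM;
	*res = 1;
	return 0;
}

int sketch_load(struct load_state *s, int fail_stage)
{
	int ret;

	ret = acquire(&s->wq, 1, fail_stage);
	if (ret)
		goto fail;		/* nothing to undo yet */

	ret = acquire(&s->mmio, 2, fail_stage);
	if (ret)
		goto fail_free_wq;

	ret = acquire(&s->clk, 3, fail_stage);
	if (ret)
		goto fail_iounmap;

	return 0;

	/* Unwind in reverse order of acquisition. */
fail_iounmap:
	s->mmio = 0;
fail_free_wq:
	s->wq = 0;
fail:
	return ret;
}
```

Calling sketch_load() with fail_stage set to 0 succeeds and leaves all three resources held; any other stage returns the error with everything released again, which is the property a single catch-all unload call cannot guarantee.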



[PATCH 0/8] tilcdc-panel: Backlight and GPIO devicetree support

2014-07-11 Thread Ezequiel Garcia
Hello all,

This patchset adds the required changes to support an optional backlight
and GPIO for the tilcdc panel driver.

There was some code to support a backlight, but it was somewhat broken
and undocumented. I've followed the nice implementation in panel-simple
and added a similar one here.

The enable GPIO is required to turn on and off devices with such capability.
Also here, I've followed panel-simple which looks correct.

In addition to this, there are very minor cosmetic cleanups and a larger
fix to the .load error path in tilcdc's DRM driver.

This patchset applies on top of drm-next branch which contains the latest
tilcdc pushed by Guido.

If at all possible, I'd like to get this merged for v3.17. If a pull request
is needed, don't hesitate to ask and I'll prepare one.

Comments and tests welcome!

Ezequiel Garcia (8):
  drm/tilcdc: Fix the error path in tilcdc_load()
  drm/tilcdc: panel: Add missing of_node_put
  drm/tilcdc: panel: Remove unused variable
  drm/tilcdc: panel: Spurious whitespace removal
  drm/tilcdc: panel: Use devm_kzalloc to simplify the error path
  drm/tilcdc: panel: Fix backlight devicetree support
  drm/tilcdc: panel: Set return value explicitly
  drm/tilcdc: panel: Add support for enable GPIO

 .../devicetree/bindings/drm/tilcdc/panel.txt   |  7 ++
 drivers/gpu/drm/tilcdc/tilcdc_drv.c| 60 +++---
 drivers/gpu/drm/tilcdc/tilcdc_panel.c  | 74 +-
 3 files changed, 114 insertions(+), 27 deletions(-)

-- 
1.9.1



[Nouveau] [PATCH 0/3] drm/gk20a: support for reclocking

2014-07-11 Thread Ben Skeggs
On Thu, Jul 10, 2014 at 5:34 PM, Alexandre Courbot  
wrote:
> This series adds support for reclocking on GK20A. The first two patches touch
> the clock subsystem to allow GK20A to operate, by making the presence of the
> thermal and voltage devices optional, and allowing pstates to be provided
> directly instead of being probed using the BIOS (which Tegra does not have).
Hey Alex,

I mentioned a while back I had some stuff in-progress to make
implementing this a bit cleaner for you, but as you can probably tell,
I didn't get to it yet.  It's likely I won't manage to before the next
merge window either.  So, I'll likely take these patches as-is
(assuming no objections on reviews here) and rebase my stuff on top.

>
> The last patch adds the GK20A clock device. Arguably the clock can be seen as a
> stripped-down version of what is seen on NVE0; however, instead of using NVE0,
> support has been written from scratch using the ChromeOS kernel as a basis.
> There are several reasons for this:
>
> - The ChromeOS driver uses a lookup table for the P coefficient which I could
>   not find in the NVE0 driver,
Interesting.  Can you give more details on how "PL" works exactly?
We'd been operating on the assumption (mainly inherited from code that
appeared in xf86-video-nv) that it was always a straight division.

> - Some registers that NVE0 expects to find are not present on GK20A (e.g.
>   0x137120 and 0x137140),
> - Calculation of MNP is done differently from what is performed in
>   nva3_pll_calc(), and it might be interesting to compare the two methods,
> - All the same, the programming sequence is done differently in the ChromeOS
>   driver and NVE0 could possibly benefit from it (?)
>
> It would be interesting to try and merge both, but for now I prefer to have the
> two coexisting to ensure proper operation on GK20A and be sure I don't break
> dGPU support. :)
It's something we can look to achieving down the road, but won't block
merging the support.

>
> Regarding the first patch, one might argue that I could as well add thermal
> and voltage devices to GK20A. The reason this is not done is because these
> currently depend heavily on the presence of a BIOS, and will require a rework
> similar to that done in patch 2 for clocks. I would like to make sure this
> approach is approved before applying it to other subdevs.
They don't *need* to depend on the BIOS, you can opt for an
implementation that doesn't use the base classes that the rest of the
dGPU implementations do.  But, it's fine to take the approach you've
taken.

Thanks!
Ben.

>
> Alexandre Courbot (3):
>   drm/nouveau/clk: make therm and volt devices optional
>   drm/nouveau/clk: support for non-BIOS pstates
>   drm/gk20a: reclocking support
>
>  drivers/gpu/drm/nouveau/Makefile   |   1 +
>  drivers/gpu/drm/nouveau/core/engine/device/nve0.c  |   1 +
>  .../gpu/drm/nouveau/core/include/subdev/clock.h|   9 +-
>  drivers/gpu/drm/nouveau/core/subdev/clock/base.c   |  52 +-
>  drivers/gpu/drm/nouveau/core/subdev/clock/gk20a.c  | 670 
> +
>  drivers/gpu/drm/nouveau/core/subdev/clock/nv04.c   |   4 +-
>  drivers/gpu/drm/nouveau/core/subdev/clock/nv40.c   |   4 +-
>  drivers/gpu/drm/nouveau/core/subdev/clock/nv50.c   |   2 +-
>  drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c   |   4 +-
>  drivers/gpu/drm/nouveau/core/subdev/clock/nvaa.c   |   4 +-
>  drivers/gpu/drm/nouveau/core/subdev/clock/nvc0.c   |   4 +-
>  drivers/gpu/drm/nouveau/core/subdev/clock/nve0.c   |   4 +-
>  12 files changed, 725 insertions(+), 34 deletions(-)
>  create mode 100644 drivers/gpu/drm/nouveau/core/subdev/clock/gk20a.c
>
> --
> 2.0.0
>
> ___
> Nouveau mailing list
> Nouveau at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau


[PATCH 0/3] drm/gk20a: support for reclocking

2014-07-11 Thread Alexandre Courbot
On 07/10/2014 06:43 PM, Peter De Schrijver wrote:
> On Thu, Jul 10, 2014 at 09:34:34AM +0200, Alexandre Courbot wrote:
>> This series adds support for reclocking on GK20A. The first two patches touch
>> the clock subsystem to allow GK20A to operate, by making the presence of the
>> thermal and voltage devices optional, and allowing pstates to be provided
>> directly instead of being probed using the BIOS (which Tegra does not have).
>>
>> The last patch adds the GK20A clock device. Arguably the clock can be seen
>> as a stripped-down version of what is seen on NVE0; however, instead of
>> using NVE0, support has been written from scratch using the ChromeOS kernel
>> as a basis. There are several reasons for this:
>>
>> - The ChromeOS driver uses a lookup table for the P coefficient which I could
>>   not find in the NVE0 driver,
>> - Some registers that NVE0 expects to find are not present on GK20A (e.g.
>>   0x137120 and 0x137140),
>> - Calculation of MNP is done differently from what is performed in
>>   nva3_pll_calc(), and it might be interesting to compare the two methods,
>> - All the same, the programming sequence is done differently in the ChromeOS
>>   driver and NVE0 could possibly benefit from it (?)
>>
>> It would be interesting to try and merge both, but for now I prefer to have
>> the two coexisting to ensure proper operation on GK20A and be sure I don't
>> break dGPU support. :)
>>
>> Regarding the first patch, one might argue that I could as well add thermal
>> and voltage devices to GK20A. The reason this is not done is because these
>> currently depend heavily on the presence of a BIOS, and will require a rework
>> similar to that done in patch 2 for clocks. I would like to make sure this
>> approach is approved before applying it to other subdevs.
>
> I think this should use CCF so we can use pre and post rate change notifiers
> to hookup vdd_gpu DVS.

Do you mean that we should turn the Nouveau gk20a clock driver into a 
consumer of this CCF clock? I have nothing against this, but note that 
Nouveau can also perform DVS on its own, as the pstates can also contain 
a voltage to be applied to the volt device (not yet implemented in this 
series).

The question then becomes whether we want an additional layer of 
abstraction on these devices and whether the pre/post rate change 
notifiers give us any advantage compared to what Nouveau currently proposes.



[PATCH 0/3] drm/gk20a: support for reclocking

2014-07-11 Thread Alexandre Courbot
On 07/10/2014 06:50 PM, Mikko Perttunen wrote:
> Does GK20A itself have any kind of thermal protection capabilities?
> Upstream SOCTHERM support is not yet available (though I have a driver
> in my tree), so we are thinking of disabling CPU DVFS on boards that
> don't have always-on active cooling for now. Same might be necessary for
> GPU as well.

There is a small thermal driver
(https://android.googlesource.com/kernel/tegra/+/b445e5296764d18861a6450f6851f25b9ca59dee/drivers/video/tegra/host/gk20a/therm_gk20a.c)
but it doesn't seem to do much. I believe that for Tegra we rely on
SOCTHERM instead, but maybe Ken could confirm?


[PATCH 38/83] hsa/radeon: Workaround for a bug in amd_iommu

2014-07-11 Thread Joerg Roedel
On Fri, Jul 11, 2014 at 12:53:54AM +0300, Oded Gabbay wrote:
> This patch creates a workaround for a bug in amd_iommu driver, where
> the driver doesn't save all necessary information when going to
> suspend.  The workaround removes a device from the IOMMU device list
> on suspend and register a resumed device in the IOMMU device list.
> 
> Signed-off-by: Oded Gabbay 

Which bug do you work around here? It needs to be fixed in the AMD IOMMU
driver instead of being wrapped in KFD.


Joerg




[Nouveau] [PATCH 0/3] drm/gk20a: support for reclocking

2014-07-11 Thread Alexandre Courbot
Hi Ben,

On 07/11/2014 10:07 AM, Ben Skeggs wrote:
> On Thu, Jul 10, 2014 at 5:34 PM, Alexandre Courbot wrote:
>> This series adds support for reclocking on GK20A. The first two patches touch
>> the clock subsystem to allow GK20A to operate, by making the presence of the
>> thermal and voltage devices optional, and allowing pstates to be provided
>> directly instead of being probed using the BIOS (which Tegra does not have).
> Hey Alex,
>
> I mentioned a while back I had some stuff in-progress to make
> implementing this a bit cleaner for you, but as you can probably tell,
> I didn't get to it yet.  It's likely I won't manage to before the next
> merge window either.  So, I'll likely take these patches as-is
> (assuming no objections on reviews here) and rebase my stuff on top.

Thanks. You will probably notice that these patches won't apply to your 
tree and require some tweaking. I will probably end up sending a v2 
anyway, so maybe you should wait for it. If you want to try this version 
anyway, I have fixed-up patches against your tree.

Note that your tree currently won't build against -next because it uses 
drm_sysfs_connector_add/remove which are not available anymore (replaced 
by drm_connector_register/unregister I believe).

Oh and while I'm at it, there seems to be a typo in line 131 of clock.h, 
which should read _nouveau_clock_fini and not _nouveau_clock_init.

>>
>> The last patch adds the GK20A clock device. Arguably the clock can be seen
>> as a stripped-down version of what is seen on NVE0, however instead of using
>> NVE0 support has been written from scratch using the ChromeOS kernel as a
>> basis. There are several reasons for this:
>>
>> - The ChromeOS driver uses a lookup table for the P coefficient which I could
>>   not find in the NVE0 driver,
> Interesting.  Can you give more details on how "PL" works exactly,
> we'd been operating on the assumption (mainly inherited from code that
> appeared in xf86-video-nv) that it was always a straight division.

The pl_to_div table in clock/gk20a.c should give the right coefficients, 
but I have seen contradictory information in our docs. Let me ask the 
right people so we can get to the bottom of this.
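
To make the two readings of "PL" concrete, here is a minimal user-space
sketch. It assumes the usual PLL relation rate = ref * N / (M * P) and
compares "PL is the divider" against "PL indexes a divider table". The
table contents and all example_* / pll_rate_* names are hypothetical and
are not copied from clock/gk20a.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical PL -> divider table, for illustration only.  The real
 * coefficients are whatever pl_to_div[] in clock/gk20a.c contains. */
static const uint8_t example_pl_to_div[] = { 1, 2, 3, 4, 5, 6, 8, 10 };

#define EXAMPLE_PL_COUNT \
	(sizeof(example_pl_to_div) / sizeof(example_pl_to_div[0]))

/* Output rate if PL is used directly as the post divider
 * ("straight division"). */
static uint32_t pll_rate_straight(uint32_t ref_khz, uint32_t n,
				  uint32_t m, uint32_t pl)
{
	assert(m != 0 && pl != 0);
	return (uint32_t)(((uint64_t)ref_khz * n) / ((uint64_t)m * pl));
}

/* Output rate if PL is an index into a table of post dividers. */
static uint32_t pll_rate_table(uint32_t ref_khz, uint32_t n,
			       uint32_t m, uint32_t pl)
{
	assert(m != 0 && pl < EXAMPLE_PL_COUNT);
	return (uint32_t)(((uint64_t)ref_khz * n) /
			  ((uint64_t)m * example_pl_to_div[pl]));
}
```

With these made-up numbers, the same register value gives different rates
under the two interpretations as soon as the table stops being linear
(PL = 6 divides by 6 in one case and by 8 in the other), which is why the
exact meaning of the field matters.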

>
>> - Some registers that NVE0 expects to find are not present on GK20A (e.g.
>>   0x137120 and 0x137140),
>> - Calculation of MNP is done differently from what is performed in
>>   nva3_pll_calc(), and it might be interesting to compare the two methods,
>> - All the same, the programming sequence is done differently in the ChromeOS
>>   driver and NVE0 could possibly benefit from it (?)
>>
>> It would be interesting to try and merge both, but for now I prefer to have
>> the two coexisting to ensure proper operation on GK20A and be sure I don't
>> break dGPU support. :)
> It's something we can look to achieving down the road, but won't block
> merging the support.

Great, thanks!

>
>>
>> Regarding the first patch, one might argue that I could as well add thermal
>> and voltage devices to GK20A. The reason this is not done is because these
>> currently depend heavily on the presence of a BIOS, and will require a rework
>> similar to that done in patch 2 for clocks. I would like to make sure this
>> approach is approved before applying it to other subdevs.
> They don't *need* to depend on the BIOS, you can opt for an
> implementation that doesn't use the base classes that the rest of the
> dGPU implementations do.  But, it's fine to take the approach you've
> taken.

At first I did not use the base classes for the gk20a clock
implementation, but it resulted in replicating some init code and I was
concerned that this might be a source of bugs in the future (e.g. the
base clock init gets updated but not the gk20a init). So I preferred the
current approach which keeps everything in the same place.

Since you have no concern with it I will apply the same to volt and 
therm, and we can then get rid of patch 1. Then I should probably send 
you a v2 once the PL thing is cleared.

Cheers,
Alex.


[PATCH 09/83] hsa/radeon: Add code base of hsa driver for AMD's GPUs

2014-07-11 Thread Joe Perches
On Fri, 2014-07-11 at 13:04 -0400, Jerome Glisse wrote:
> On Fri, Jul 11, 2014 at 12:50:09AM +0300, Oded Gabbay wrote:
[]
> > +static long kfd_ioctl(struct file *, unsigned int, unsigned long);
> 
> Nitpick, avoid unsigned int just use unsigned.

I suggest unsigned int is much more common (and better)
than just unsigned.

$ git grep -P '\bunsigned\s+(?!long|int|short|char)' -- "*.[ch]" | wc -l
20778

$ git grep -P "\bunsigned\s+int\b" -- "*.[ch]" | wc -l
98068

> > +static int kfd_open(struct inode *, struct file *);

It's also generally better to use types and names to
improve how a human reads and understands the code.
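
As a small illustration of that point, the sketch below declares the same
function twice, once with types only and once with types and names. It is
plain user-space C; struct example_file and the example_* functions are
hypothetical stand-ins, not the KFD code.

```c
struct example_file;	/* hypothetical stand-in for the kernel's struct file */

/* Types only: legal C, but the reader must guess what each argument is. */
long example_ioctl_terse(struct example_file *, unsigned int, unsigned long);

/* Types and names: the prototype documents itself. */
long example_ioctl(struct example_file *filep, unsigned int cmd,
		   unsigned long arg);

/* Either prototype style is compatible with the same kind of definition. */
long example_ioctl_terse(struct example_file *filep, unsigned int cmd,
			 unsigned long arg)
{
	(void)filep;
	(void)arg;
	return (long)cmd;	/* echo the command so the sketch is observable */
}

long example_ioctl(struct example_file *filep, unsigned int cmd,
		   unsigned long arg)
{
	return example_ioctl_terse(filep, cmd, arg);
}
```

The compiler treats both declarations identically; the difference is only
in what a reader learns from the header without opening the definition.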




[PATCH 2/2] drm/radeon: add user pointer support v4

2014-07-11 Thread Alex Deucher
On Fri, Jul 11, 2014 at 9:56 AM, Christian König wrote:
> From: Christian König
>
> This patch adds an IOCTL for turning a pointer supplied by
> userspace into a buffer object.
>
> It imposes several restrictions upon the memory being mapped:
>
> 1. It must be page aligned (both start/end addresses, i.e. ptr and size).
>
> 2. It must be normal system memory, not a pointer into another map of IO
> space (e.g. it must not be a GTT mmapping of another object).
>
> 3. The BO is mapped into GTT, so the maximum amount of memory mapped at
> all times is still the GTT limit.
>
> 4. The BO is only mapped readonly for now, so no write support.
>
> 5. List of backing pages is only acquired once, so they represent a
> snapshot of the first use.
>
> Exporting and sharing as well as mapping of buffer objects created by
> this function is forbidden and results in an -EPERM.
>
> v2: squash all previous changes into first public version
> v3: fix tabs, map readonly, don't use MM callback any more
> v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
> pin/unpin pages on bind/unbind instead of populate/unpopulate
>
> Signed-off-by: Christian König
> Reviewed-by: Alex Deucher (v3)
> Reviewed-by: Jérôme Glisse (v3)

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/radeon/radeon.h|   4 ++
>  drivers/gpu/drm/radeon/radeon_cs.c |  25 +++-
>  drivers/gpu/drm/radeon/radeon_drv.c|   5 +-
>  drivers/gpu/drm/radeon/radeon_gem.c|  67 +++
>  drivers/gpu/drm/radeon/radeon_kms.c|   1 +
>  drivers/gpu/drm/radeon/radeon_object.c |   3 +
>  drivers/gpu/drm/radeon/radeon_prime.c  |  10 +++
>  drivers/gpu/drm/radeon/radeon_ttm.c| 113 -
>  drivers/gpu/drm/radeon/radeon_vm.c |   3 +
>  include/uapi/drm/radeon_drm.h  |  11 
>  10 files changed, 238 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 8a190ce..ee55b01 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -2111,6 +2111,8 @@ int radeon_gem_info_ioctl(struct drm_device *dev, void *data,
>   struct drm_file *filp);
>  int radeon_gem_create_ioctl(struct drm_device *dev, void *data,
> struct drm_file *filp);
> +int radeon_gem_import_ioctl(struct drm_device *dev, void *data,
> +   struct drm_file *filp);
>  int radeon_gem_pin_ioctl(struct drm_device *dev, void *data,
>  struct drm_file *file_priv);
>  int radeon_gem_unpin_ioctl(struct drm_device *dev, void *data,
> @@ -2835,6 +2837,8 @@ extern void radeon_legacy_set_clock_gating(struct radeon_device *rdev, int enabl
>  extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int enable);
>  extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain);
>  extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
> +extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t userptr);
> +extern bool radeon_ttm_tt_has_userptr(struct ttm_tt *ttm);
>  extern void radeon_vram_location(struct radeon_device *rdev, struct radeon_mc *mc, u64 base);
>  extern void radeon_gtt_location(struct radeon_device *rdev, struct radeon_mc *mc);
>  extern int radeon_resume_kms(struct drm_device *dev, bool resume, bool fbcon);
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
> index 71a1434..be65311 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -78,7 +78,8 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
> struct radeon_cs_chunk *chunk;
> struct radeon_cs_buckets buckets;
> unsigned i, j;
> -   bool duplicate;
> +   bool duplicate, need_mmap_lock = false;
> +   int r;
>
> if (p->chunk_relocs_idx == -1) {
> return 0;
> @@ -164,6 +165,19 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
> p->relocs[i].allowed_domains = domain;
> }
>
> +   if (radeon_ttm_tt_has_userptr(p->relocs[i].robj->tbo.ttm)) {
> +   uint32_t domain = p->relocs[i].prefered_domains;
> +   if (!(domain & RADEON_GEM_DOMAIN_GTT)) {
> +   DRM_ERROR("Only RADEON_GEM_DOMAIN_GTT is "
> + "allowed for userptr BOs\n");
> +   return -EINVAL;
> +   }
> +   need_mmap_lock = true;
> +   domain = RADEON_GEM_DOMAIN_GTT;
> +   p->relocs[i].prefered_domains = domain;
> +   p->relocs[i].allowed_domains = domain;
> +   }
> +
> p->relocs[i].tv.bo = &p->relocs[i].robj->tbo;
> p->relocs[i].handle = r->handle;

[PATCH 1/2] drm/radeon: add readonly flag to radeon_gart_set_page v3

2014-07-11 Thread Alex Deucher
On Fri, Jul 11, 2014 at 9:56 AM, Christian König wrote:
> From: Christian König
>
> v2: use flag instead of boolean
> v3: keep R600_PTE_GART as it is
>
> Signed-off-by: Christian König
> ---
>  drivers/gpu/drm/radeon/r100.c|  2 +-
>  drivers/gpu/drm/radeon/r300.c|  8 ++--
>  drivers/gpu/drm/radeon/radeon.h  | 10 ++
>  drivers/gpu/drm/radeon/radeon_asic.h |  8 
>  drivers/gpu/drm/radeon/radeon_gart.c |  9 +
>  drivers/gpu/drm/radeon/radeon_ttm.c  |  4 ++--
>  drivers/gpu/drm/radeon/rs400.c   |  9 +++--
>  drivers/gpu/drm/radeon/rs600.c   |  8 ++--
>  8 files changed, 37 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
> index e32abf3..f58b5d1 100644
> --- a/drivers/gpu/drm/radeon/r100.c
> +++ b/drivers/gpu/drm/radeon/r100.c
> @@ -650,7 +650,7 @@ void r100_pci_gart_disable(struct radeon_device *rdev)
>  }
>
>  void r100_pci_gart_set_page(struct radeon_device *rdev, unsigned i,
> -   uint64_t addr)
> +   uint64_t addr, uint32_t flags)
>  {
> u32 *gtt = rdev->gart.ptr;
> gtt[i] = cpu_to_le32(lower_32_bits(addr));
> diff --git a/drivers/gpu/drm/radeon/r300.c b/drivers/gpu/drm/radeon/r300.c
> index 8d14e66..b947f42 100644
> --- a/drivers/gpu/drm/radeon/r300.c
> +++ b/drivers/gpu/drm/radeon/r300.c
> @@ -73,13 +73,17 @@ void rv370_pcie_gart_tlb_flush(struct radeon_device *rdev)
>  #define R300_PTE_READABLE  (1 << 3)
>
>  void rv370_pcie_gart_set_page(struct radeon_device *rdev, unsigned i,
> - uint64_t addr)
> + uint64_t addr, uint32_t flags)
>  {
> void __iomem *ptr = rdev->gart.ptr;
>
> addr = (lower_32_bits(addr) >> 8) |
>((upper_32_bits(addr) & 0xff) << 24) |
> -  R300_PTE_WRITEABLE | R300_PTE_READABLE;
> +  R300_PTE_READABLE;
> +
> +   if (!(flags & RADEON_GART_PAGE_READONLY))
> +   addr |= R300_PTE_WRITEABLE;
> +
> /* on x86 we want this to be CPU endian, on powerpc
>  * on powerpc without HW swappers, it'll get swapped on way
>  * into VRAM - so no need for cpu_to_le32 on VRAM tables */
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 7cda75d..8a190ce 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -592,6 +592,8 @@ struct radeon_mc;
>  #define RADEON_GPU_PAGE_SHIFT 12
>  #define RADEON_GPU_PAGE_ALIGN(a) (((a) + RADEON_GPU_PAGE_MASK) & ~RADEON_GPU_PAGE_MASK)
>
> +#define RADEON_GART_PAGE_READONLY  1

This is fine for now, but once we add more flags we should change them
to reflect the usage.  E.g.,
RADEON_GART_PAGE_READ
RADEON_GART_PAGE_WRITE
RADEON_GART_PAGE_SNOOP
and set them explicitly.

Alex
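
A rough sketch of what explicit flags could look like, written as
user-space C so it can be compiled on its own. The flag names follow the
suggestion above with an EXAMPLE_ prefix; the flag values, the
EXAMPLE_PTE_* bit positions (other than READABLE, which mirrors the
(1 << 3) shown in the quoted r300 hunk) and the helper name are
assumptions for illustration, not the driver's definitions.

```c
#include <stdint.h>

/* Hypothetical explicit GART page flags (values are assumptions). */
#define EXAMPLE_GART_PAGE_READ   (1u << 0)
#define EXAMPLE_GART_PAGE_WRITE  (1u << 1)
#define EXAMPLE_GART_PAGE_SNOOP  (1u << 2)

/* Illustrative PTE bits.  READABLE matches the quoted hunk; the other
 * two positions are assumed for this sketch. */
#define EXAMPLE_PTE_UNSNOOPED    (1u << 0)
#define EXAMPLE_PTE_WRITEABLE    (1u << 2)
#define EXAMPLE_PTE_READABLE     (1u << 3)

/* Translate generic page flags into hardware PTE bits.  Each capability
 * is requested explicitly instead of being implied by the absence of a
 * READONLY flag. */
static uint32_t example_gart_flags_to_pte(uint32_t flags)
{
	uint32_t pte = 0;

	if (flags & EXAMPLE_GART_PAGE_READ)
		pte |= EXAMPLE_PTE_READABLE;
	if (flags & EXAMPLE_GART_PAGE_WRITE)
		pte |= EXAMPLE_PTE_WRITEABLE;
	if (!(flags & EXAMPLE_GART_PAGE_SNOOP))
		pte |= EXAMPLE_PTE_UNSNOOPED;

	return pte;
}
```

A read-only, snooped mapping would then be requested as
EXAMPLE_GART_PAGE_READ | EXAMPLE_GART_PAGE_SNOOP, and each ASIC's
set_page callback would only need its own flags-to-PTE translation.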

> +
>  struct radeon_gart {
> dma_addr_t  table_addr;
> struct radeon_bo*robj;
> @@ -616,7 +618,7 @@ void radeon_gart_unbind(struct radeon_device *rdev, unsigned offset,
> int pages);
>  int radeon_gart_bind(struct radeon_device *rdev, unsigned offset,
>  int pages, struct page **pagelist,
> -dma_addr_t *dma_addr);
> +dma_addr_t *dma_addr, uint32_t flags);
>
>
>  /*
> @@ -855,7 +857,7 @@ struct radeon_mec {
>
>  /* flags used for GART page table entries on R600+ */
>  #define R600_PTE_GART  ( R600_PTE_VALID | R600_PTE_SYSTEM | R600_PTE_SNOOPED \
> -   | R600_PTE_READABLE | R600_PTE_WRITEABLE)
> +   | R600_PTE_READABLE | R600_PTE_WRITEABLE )
>
>  struct radeon_vm_pt {
> struct radeon_bo*bo;
> @@ -1775,7 +1777,7 @@ struct radeon_asic {
> struct {
> void (*tlb_flush)(struct radeon_device *rdev);
> void (*set_page)(struct radeon_device *rdev, unsigned i,
> -uint64_t addr);
> +uint64_t addr, uint32_t flags);
> } gart;
> struct {
> int (*init)(struct radeon_device *rdev);
> @@ -2735,7 +2737,7 @@ void radeon_ring_write(struct radeon_ring *ring, uint32_t v);
>  #define radeon_vga_set_state(rdev, state) (rdev)->asic->vga_set_state((rdev), (state))
>  #define radeon_asic_reset(rdev) (rdev)->asic->asic_reset((rdev))
>  #define radeon_gart_tlb_flush(rdev) (rdev)->asic->gart.tlb_flush((rdev))
> -#define radeon_gart_set_page(rdev, i, p) (rdev)->asic->gart.set_page((rdev), (i), (p))
> +#define radeon_gart_set_page(rdev, i, p, r) (rdev)->asic->gart.set_page((rdev), (i), (p), (r))
>  #define radeon_asic_vm_init(rdev) (rdev)->asic->vm.init((rdev))
>  #define radeon_asic_vm_fini(rdev) (rdev)->asic->vm.fini((rdev))
>  #define radeon_asic_vm_set_page(rdev, ib, pe, addr, count, incr, flags) 
> ((rdev)->asic->vm.set_page((rdev), (ib), (pe), (addr), (count), 

[PATCH 0/3] drm/gk20a: support for reclocking

2014-07-11 Thread Martin Peres
On 11/07/2014 03:42, Alexandre Courbot wrote:
> On 07/10/2014 06:50 PM, Mikko Perttunen wrote:
>> Does GK20A itself have any kind of thermal protection capabilities?
>> Upstream SOCTHERM support is not yet available (though I have a driver
>> in my tree), so we are thinking of disabling CPU DVFS on boards that
>> don't have always-on active cooling for now. Same might be necessary for
>> GPU as well.
>
> There is a small thermal driver
> (https://android.googlesource.com/kernel/tegra/+/b445e5296764d18861a6450f6851f25b9ca59dee/drivers/video/tegra/host/gk20a/therm_gk20a.c)
> but it doesn't seem to do much. I believe that for Tegra we rely on
> SOCTHERM instead, but maybe Ken could confirm?

Unless it changed a great deal, I reverse engineered most of the 
registers in this area (for the nv50), but some stuff didn't change that 
much and could be used straight away (temperature reading, sending IRQs 
on thresholds, auto downclocking when overheating). We are not really 
using those features on nouveau right now, we just poll on the 
temperature every second.

If we are moving to using the PMU for thermal monitoring, we can do the 
polling there and warn the userspace when the temperature is too high or 
if performance is insufficient / too much. I have part of the code for 
performance counters in PMU, it's dead simple.

Martin


[Nouveau] [PATCH v4 4/6] drm/nouveau: synchronize BOs when required

2014-07-11 Thread Daniel Vetter
On Fri, Jul 11, 2014 at 11:40:27AM +0900, Alexandre Courbot wrote:
> On 07/10/2014 10:04 PM, Daniel Vetter wrote:
> >On Tue, Jul 08, 2014 at 05:25:59PM +0900, Alexandre Courbot wrote:
> >>On architectures for which access to GPU memory is non-coherent,
> >>caches need to be flushed and invalidated explicitly when BO control
> >>changes between CPU and GPU.
> >>
> >>This patch adds buffer synchronization functions which invoke the
> >>correct API (PCI or DMA) to ensure synchronization is effective.
> >>
> >>Based on the TTM DMA cache helper patches by Lucas Stach.
> >>
> >>Signed-off-by: Lucas Stach 
> >>Signed-off-by: Alexandre Courbot 
> >>---
> >>  drivers/gpu/drm/nouveau/nouveau_bo.c  | 56 +++
> >>  drivers/gpu/drm/nouveau/nouveau_bo.h  |  2 ++
> >>  drivers/gpu/drm/nouveau/nouveau_gem.c | 12 
> >>  3 files changed, 70 insertions(+)
> >>
> >>diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> >>index 67e9e8e2e2ec..47e4e8886769 100644
> >>--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> >>+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> >>@@ -402,6 +402,60 @@ nouveau_bo_unmap(struct nouveau_bo *nvbo)
> >>ttm_bo_kunmap(&nvbo->kmap);
> >>  }
> >>
> >>+void
> >>+nouveau_bo_sync_for_device(struct nouveau_bo *nvbo)
> >>+{
> >>+   struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
> >>+   struct nouveau_device *device = nouveau_dev(drm->dev);
> >>+   struct ttm_dma_tt *ttm_dma = (struct ttm_dma_tt *)nvbo->bo.ttm;
> >>+   int i;
> >>+
> >>+   if (!ttm_dma)
> >>+   return;
> >>+
> >>+   if (nv_device_is_cpu_coherent(device) || nvbo->force_coherent)
> >>+   return;
> >
> >Is the is_cpu_coherent check really required? On coherent platforms the
> >sync_for_foo should be a noop. It's the dma api's job to encapsulate this
> >knowledge so that drivers can be blissfully ignorant. The explicit
> >is_coherent check makes this a bit leaky. And the same comment applies:
> >underneath, the bus-specific dma-mapping functions are identical.
> 
> I think you are right, the is_cpu_coherent check should not be needed here.
> I still think we should have separate paths for the PCI/DMA cases though,
> unless you can point me to a source that clearly states that the PCI API is
> deprecated and that DMA should be used instead.

Ah, on 2nd look I've found it again. Quoting
Documentation/DMA-API-HOWTO.txt:

"Note that the DMA API works with any bus independent of the underlying
microprocessor architecture. You should use the DMA API rather than the
bus-specific DMA API, i.e., use the dma_map_*() interfaces rather than the
pci_map_*() interfaces."

The advice is fairly strong here I think ;-) And imo the idea makes sense,
since it allows drivers like nouveau here to care much less about the
actual bus used to get data to/from the ip block. And if you look at intel
gfx it makes even more sense since the pci layer we have is really just a
thin fake shim whacked on top of the hw (on SoCs at least).

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Nouveau] [PATCH v4 2/6] drm/nouveau: map pages using DMA API on platform devices

2014-07-11 Thread Daniel Vetter
On Fri, Jul 11, 2014 at 11:35:23AM +0900, Alexandre Courbot wrote:
> On 07/10/2014 09:58 PM, Daniel Vetter wrote:
> >On Tue, Jul 08, 2014 at 05:25:57PM +0900, Alexandre Courbot wrote:
> >>page_to_phys() is not the correct way to obtain the DMA address of a
> >>buffer on a non-PCI system. Use the DMA API functions for this, which
> >>are portable and will allow us to use other DMA API functions for
> >>buffer synchronization.
> >>
> >>Signed-off-by: Alexandre Courbot 
> >>---
> >>  drivers/gpu/drm/nouveau/core/engine/device/base.c | 8 +++-
> >>  1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >>diff --git a/drivers/gpu/drm/nouveau/core/engine/device/base.c b/drivers/gpu/drm/nouveau/core/engine/device/base.c
> >>index 18c8c7245b73..e4e9e64988fe 100644
> >>--- a/drivers/gpu/drm/nouveau/core/engine/device/base.c
> >>+++ b/drivers/gpu/drm/nouveau/core/engine/device/base.c
> >>@@ -489,7 +489,10 @@ nv_device_map_page(struct nouveau_device *device, struct page *page)
> >>if (pci_dma_mapping_error(device->pdev, ret))
> >>ret = 0;
> >>} else {
> >>-   ret = page_to_phys(page);
> >>+   ret = dma_map_page(&device->platformdev->dev, page, 0,
> >>+  PAGE_SIZE, DMA_BIDIRECTIONAL);
> >>+   if (dma_mapping_error(&device->platformdev->dev, ret))
> >>+   ret = 0;
> >>}
> >>
> >>return ret;
> >>@@ -501,6 +504,9 @@ nv_device_unmap_page(struct nouveau_device *device, dma_addr_t addr)
> >>if (nv_device_is_pci(device))
> >>pci_unmap_page(device->pdev, addr, PAGE_SIZE,
> >>   PCI_DMA_BIDIRECTIONAL);
> >
> >pci_map/unmap alias to dma_map/unmap when called on the underlying struct
> >device embedded in pci_device (like for platform drivers). Dunno whether
> >it's worth tracking a pointer to the struct device directly and always
> >calling dma_map/unmap.
> 
> Isn't it (theoretically) possible to have a platform that does not use the
> DMA API for its PCI implementation and thus requires the pci_* functions to
> be called? I could not find such a case in -next, which suggests that all
> PCI platforms have been converted to the DMA API already and that we could
> indeed refactor this to always use the DMA functions.
> 
> But at the same time the way we use APIs should not be directed by their
> implementation, but by their intent - and unless the PCI API has been
> deprecated in some way (something I am not aware of), the rule is still that
> you should use it on a PCI device.

Hm, somehow I've thought that it's recommended to just use the dma api
directly. It's what we're doing in i915 at least, but now I'm not so sure
any more. My guess is that this is just history really when the dma api
was pci-only.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Intel-gfx] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type

2014-07-11 Thread Daniel Vetter
On Thu, Jul 10, 2014 at 09:08:24PM +, Tian, Kevin wrote:
> actually I'm curious whether it's still necessary to __detect__ PCH. Could
> we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard
> code the knowledge:
> 
>   } else if (IS_BROADWELL(dev)) {
>   dev_priv->pch_type = PCH_LPT;
>   dev_priv->pch_id =
>   INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
>   DRM_DEBUG_KMS("This is Broadwell, assuming "
> "LynxPoint LP PCH\n");
> 
> Or if there is real usage on non-fixed mapping (not majority), could it be a
> better option to have fixed mapping as a fallback instead of leaving as
> PCH_NONE? Then even when Qemu doesn't provide a special tweaked PCH,
> the majority case just works.

I guess we can do it, at least I haven't seen any strange combinations in
the wild outside of Intel ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
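
The fallback idea discussed in this thread can be sketched in a few lines
of user-space C: a detected PCH always wins, and a per-GPU default is used
only when detection finds nothing. All example_* names are hypothetical.
The Broadwell to LynxPoint pairing comes from the quoted snippet; the
Haswell entry is an assumption added to show a second row.

```c
/* Hypothetical, reduced enums for illustration only. */
enum example_gpu {
	EXAMPLE_GPU_HASWELL,
	EXAMPLE_GPU_BROADWELL,
	EXAMPLE_GPU_OTHER,
};

enum example_pch {
	EXAMPLE_PCH_NONE,
	EXAMPLE_PCH_LPT,
};

/* Fixed GPU -> PCH mapping used only as a fallback. */
static enum example_pch example_default_pch(enum example_gpu gpu)
{
	switch (gpu) {
	case EXAMPLE_GPU_HASWELL:	/* assumed pairing */
	case EXAMPLE_GPU_BROADWELL:	/* pairing from the quoted code */
		return EXAMPLE_PCH_LPT;
	default:
		return EXAMPLE_PCH_NONE;
	}
}

/* Prefer what bus detection found; fall back to the fixed mapping when
 * nothing was detected (e.g. a guest with no matching PCH device). */
static enum example_pch example_pick_pch(enum example_gpu gpu,
					 enum example_pch detected)
{
	if (detected != EXAMPLE_PCH_NONE)
		return detected;
	return example_default_pch(gpu);
}
```

Keeping detection first means unusual real-hardware combinations still
work, while a virtual machine without a tweaked PCH device gets the
majority-case default instead of PCH_NONE.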


[PATCH] modesetting: Support native primary plane rotation

2014-07-11 Thread Chris Wilson
With the advent of universal drm planes and the introduction of generic
plane properties for rotations, we can query and program the hardware
for native rotation support.

NOTE: this depends upon the next release of libdrm to remove one
opencoded define.

v2: Use enum to determine primary plane, suggested by Pekka Paalanen.
Use libobj for replacement ffs(), suggested by walter harms
v3: Support combinations of rotations and reflections
Eliminate use of ffs() and so remove need for libobj
v4: And remove the vestigial HAVE_FFS, spotted by Mark Kettenis

Signed-off-by: Chris Wilson 
Cc: Pekka Paalanen 
Cc: walter harms 
Cc: Mark Kettenis 
---
 configure.ac  |   2 +-
 src/drmmode_display.c | 258 --
 src/drmmode_display.h |   7 +-
 3 files changed, 234 insertions(+), 33 deletions(-)

diff --git a/configure.ac b/configure.ac
index 1c1a36d..783c243 100644
--- a/configure.ac
+++ b/configure.ac
@@ -74,7 +74,7 @@ AM_CONDITIONAL(HAVE_XEXTPROTO_71, [ test "$HAVE_XEXTPROTO_71" = "yes" ])
 # Checks for header files.
 AC_HEADER_STDC

-PKG_CHECK_MODULES(DRM, [libdrm >= 2.4.46])
+PKG_CHECK_MODULES(DRM, [libdrm >= 2.4.47])
 PKG_CHECK_MODULES([PCIACCESS], [pciaccess >= 0.10])
 AM_CONDITIONAL(DRM, test "x$DRM" = xyes)

diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index c533324..93c48ac 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -56,6 +56,8 @@

 #include "driver.h"

+#define DRM_CLIENT_CAP_UNIVERSAL_PLANES 2 /* from libdrm 2.4.55 */
+
 static struct dumb_bo *dumb_bo_create(int fd,
  const unsigned width, const unsigned height,
  const unsigned bpp)
@@ -300,6 +302,171 @@ create_pixmap_for_fbcon(drmmode_ptr drmmode,

 #endif

+static void
+rotation_init(xf86CrtcPtr crtc)
+{
+   drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private;
+   drmmode_ptr drmmode = drmmode_crtc->drmmode;
+   drmModePlaneRes *plane_resources;
+   int i, j, k;
+
+   drmSetClientCap(drmmode->fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);
+
+   plane_resources = drmModeGetPlaneResources(drmmode->fd);
+   if (plane_resources == NULL)
+   return;
+
+   for (i = 0; drmmode_crtc->primary_plane_id == 0 && i < plane_resources->count_planes; i++) {
+   drmModePlane *drm_plane;
+   drmModeObjectPropertiesPtr proplist;
+   int is_primary = -1;
+
+   drm_plane = drmModeGetPlane(drmmode->fd,
+   plane_resources->planes[i]);
+   if (drm_plane == NULL)
+   continue;
+
+   if (!(drm_plane->possible_crtcs & (1 << drmmode_crtc->index)))
+   goto free_plane;
+
+   proplist = drmModeObjectGetProperties(drmmode->fd,
+ drm_plane->plane_id,
+ DRM_MODE_OBJECT_PLANE);
+   if (proplist == NULL)
+   goto free_plane;
+
+   for (j = 0; is_primary == -1 && j < proplist->count_props; j++) {
+   drmModePropertyPtr prop;
+
+   prop = drmModeGetProperty(drmmode->fd, proplist->props[j]);
+   if (!prop)
+   continue;
+
+   if (strcmp(prop->name, "type") == 0) {
+   for (k = 0; k < prop->count_enums; k++) {
+   if (prop->enums[k].value != proplist->prop_values[j])
+   continue;
+
+   is_primary = strcmp(prop->enums[k].name, "Primary") == 0;
+   break;
+   }
+   }
+
+   drmModeFreeProperty(prop);
+   }
+
+   if (is_primary) {
+   drmmode_crtc->primary_plane_id = drm_plane->plane_id;
+
+   for (j = 0; drmmode_crtc->rotation_prop_id == 0 && j < proplist->count_props; j++) {
+   drmModePropertyPtr prop;
+
+   prop = drmModeGetProperty(drmmode->fd, proplist->props[j]);
+   if (!prop)
+   continue;
+
+   if (strcmp(prop->name, "rotation") == 0) {
+   drmmode_crtc->rotation_prop_id = proplist->props[j];
+   drmmode_crtc->current_rotation = proplist->prop_values[j];
+   for (k = 0; k < prop->count_enums; k++) {
+   int rr = -1;
+   if (strcmp(prop->enums[k].name, "rotate-0") == 0)
+   rr = 

[git pull] drm driver fixes

2014-07-11 Thread Dave Airlie

Hi Linus,

Nothing too scary, we have one outstanding i915 regression but Daniel has 
promised the fix as soon as he's finished testing it a bit.

Fixes for the main x86 drivers:
radeon: dpm fixes, displayport regression fix
i915: quirks for backlight regression, edp reboot fix, valleyview black 
screen fixes
nouveau: display port regression fixes, fix for memory reclocking.

Dave.

The following changes since commit 4f440cd534359f689cb577c68f8491d1eddf0b76:

  Merge tag 'pci-v3.16-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci (2014-07-09 16:18:18 -0700)

are available in the git repository at:


  git://people.freedesktop.org/~airlied/linux drm-fixes

for you to fetch changes up to bf38b025d3f58f4c1273714ff1be5bfbf99574a4:

  Merge branch 'drm-fixes-3.16' of git://people.freedesktop.org/~agd5f/linux into drm-fixes (2014-07-11 11:24:13 +1000)



Alex Deucher (3):
  drm/radeon/dp: return -EIO for flags not zero case
  drm/radeon: fix typo in golden register setup on evergreen
  drm/radeon: fix typo in ci_stop_dpm()

Alexandre Demers (1):
  drm/radeon/dpm: Reenabling SS on Cayman

Ben Skeggs (6):
  drm/gk104/ram: bash mpll bit 31 on
  drm/nv50-/kms: pass a non-zero value for head to sor dpms methods
  drm/nouveau/kms: restore fbcon after display has been resumed
  drm/nouveau/dp: fix required link bandwidth calculations
  drm/nouveau/dp: workaround broken display
  drm/nouveau/ram: fix test for gpio presence

Christian König (1):
  drm/radeon: only print meaningful VM faults

Clint Taylor (1):
  drm/i915/vlv: T12 eDP panel timing enforcement during reboot

Daniel Vetter (1):
  drm/i915: Only unbind vgacon, not other console drivers

Dave Airlie (3):
  Merge branch 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
  Merge tag 'drm-intel-fixes-2014-07-09' of git://anongit.freedesktop.org/drm-intel into drm-fixes
  Merge branch 'drm-fixes-3.16' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Scot Doyle (3):
  drm/i915: quirk asserts controllable backlight presence, overriding VBT
  drm/i915: Acer C720 and C720P have controllable backlights
  drm/i915: Toshiba CB35 has a controllable backlight

Shobhit Kumar (2):
  drm/i915/vlv: DPI FIFO empty check is not needed
  drm/i915/vlv: Update the DSI ULPS entry/exit sequence

Stéphane Marchesin (1):
  drm/nouveau/fb: Prevent inlining of ramfuc_reg

Ville Syrjälä (1):
  drm/i915: Don't clobber the GTT when it's within stolen memory

 drivers/gpu/drm/i915/i915_dma.c                    |  5 ++-
 drivers/gpu/drm/i915/i915_drv.h                    |  1 +
 drivers/gpu/drm/i915/i915_gem_stolen.c             | 44 ++
 drivers/gpu/drm/i915/i915_reg.h                    |  3 ++
 drivers/gpu/drm/i915/intel_display.c               | 14 +++
 drivers/gpu/drm/i915/intel_dp.c                    | 42 +
 drivers/gpu/drm/i915/intel_drv.h                   |  2 +
 drivers/gpu/drm/i915/intel_dsi.c                   | 29 +++---
 drivers/gpu/drm/i915/intel_dsi_cmd.c               |  6 ---
 drivers/gpu/drm/i915/intel_panel.c                 |  8 +++-
 drivers/gpu/drm/nouveau/core/engine/disp/nv50.c    |  6 +--
 drivers/gpu/drm/nouveau/core/engine/disp/nvd0.c    |  6 +--
 drivers/gpu/drm/nouveau/core/engine/disp/outpdp.c  |  8 ++--
 drivers/gpu/drm/nouveau/core/engine/disp/sornv50.c |  1 +
 drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h    |  4 +-
 drivers/gpu/drm/nouveau/core/subdev/fb/ramnve0.c   |  1 +
 drivers/gpu/drm/nouveau/nouveau_drm.c              | 17 +
 drivers/gpu/drm/nouveau/nouveau_fbcon.c            | 13 ++-
 drivers/gpu/drm/nouveau/nouveau_fbcon.h            |  1 -
 drivers/gpu/drm/nouveau/nv50_display.c             |  3 +-
 drivers/gpu/drm/radeon/atombios_dp.c               |  2 +-
 drivers/gpu/drm/radeon/ci_dpm.c                    |  2 +-
 drivers/gpu/drm/radeon/cik.c                       |  6 ++-
 drivers/gpu/drm/radeon/evergreen.c                 | 14 ---
 drivers/gpu/drm/radeon/rv770_dpm.c                 |  6 ---
 drivers/gpu/drm/radeon/si.c                        |  6 ++-
 26 files changed, 177 insertions(+), 73 deletions(-)


[PATCH 83/83] hsa/radeon: Update module version to 0.6.2

2014-07-11 Thread Oded Gabbay
This version is intended for upstreaming to the Linux kernel 3.17

Signed-off-by: Oded Gabbay 
---
 drivers/gpu/hsa/radeon/kfd_module.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/hsa/radeon/kfd_module.c b/drivers/gpu/hsa/radeon/kfd_module.c
index c706236..c783eeb 100644
--- a/drivers/gpu/hsa/radeon/kfd_module.c
+++ b/drivers/gpu/hsa/radeon/kfd_module.c
@@ -30,10 +30,10 @@
 #define KFD_DRIVER_AUTHOR  "AMD Inc. and others"

 #define KFD_DRIVER_DESC "Standalone HSA driver for AMD's GPUs"
-#define KFD_DRIVER_DATE "20140623"
+#define KFD_DRIVER_DATE "20140710"
 #define KFD_DRIVER_MAJOR   0
 #define KFD_DRIVER_MINOR   6
-#define KFD_DRIVER_PATCHLEVEL  1
+#define KFD_DRIVER_PATCHLEVEL  2

 const struct kfd2kgd_calls *kfd2kgd;
 static const struct kgd2kfd_calls kgd2kfd = {
-- 
1.9.1



[PATCH 82/83] drm/radeon: Remove lock functions from kfd2kgd interface

2014-07-11 Thread Oded Gabbay
From: Ben Goz 

Signed-off-by: Ben Goz 
Signed-off-by: Oded Gabbay 
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 44 -
 include/linux/radeon_kfd.h          | 10 -
 2 files changed, 54 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 738c2b3..7e8e041 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -115,12 +115,6 @@ static void unkmap_mem(struct kgd_dev *kgd, struct kgd_mem *mem);
 static uint64_t get_vmem_size(struct kgd_dev *kgd);
 static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);

-static void lock_srbm_gfx_cntl(struct kgd_dev *kgd);
-static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
-
-static void lock_grbm_gfx_idx(struct kgd_dev *kgd);
-static void unlock_grbm_gfx_idx(struct kgd_dev *kgd);
-
 static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);

 /*
@@ -146,10 +140,6 @@ static const struct kfd2kgd_calls kfd2kgd = {
.unkmap_mem = unkmap_mem,
.get_vmem_size = get_vmem_size,
.get_gpu_clock_counter = get_gpu_clock_counter,
-   .lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
-   .unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
-   .lock_grbm_gfx_idx = lock_grbm_gfx_idx,
-   .unlock_grbm_gfx_idx = unlock_grbm_gfx_idx,
.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
.program_sh_mem_settings = kgd_program_sh_mem_settings,
.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
@@ -200,8 +190,6 @@ void radeon_kfd_device_init(struct radeon_device *rdev)
 {
if (rdev->kfd) {
struct kgd2kfd_shared_resources gpu_resources = {
-   .mmio_registers = rdev->rmmio,
-
.compute_vmid_bitmap = 0xFF00,

.first_compute_pipe = 1,
@@ -363,38 +351,6 @@ static uint64_t get_vmem_size(struct kgd_dev *kgd)
return rdev->mc.real_vram_size;
 }

-static void lock_srbm_gfx_cntl(struct kgd_dev *kgd)
-{
-   struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-   mutex_lock(&rdev->srbm_mutex);
-}
-
-static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd)
-{
-   struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-   mutex_unlock(&rdev->srbm_mutex);
-}
-
-static void lock_grbm_gfx_idx(struct kgd_dev *kgd)
-{
-   struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-   BUG_ON(kgd == NULL);
-
-   mutex_lock(&rdev->grbm_idx_mutex);
-}
-
-static void unlock_grbm_gfx_idx(struct kgd_dev *kgd)
-{
-   struct radeon_device *rdev = (struct radeon_device *)kgd;
-
-   BUG_ON(kgd == NULL);
-
-   mutex_unlock(&rdev->grbm_idx_mutex);
-}
-
 static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd)
 {
struct radeon_device *rdev = (struct radeon_device *)kgd;
diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
index aa021fb..2fffe32 100644
--- a/include/linux/radeon_kfd.h
+++ b/include/linux/radeon_kfd.h
@@ -45,8 +45,6 @@ enum kgd_memory_pool {
 };

 struct kgd2kfd_shared_resources {
-   void __iomem *mmio_registers; /* Mapped pointer to GFX MMIO registers. */
-
unsigned int compute_vmid_bitmap; /* Bit n == 1 means VMID n is available for KFD. */

unsigned int first_compute_pipe; /* Compute pipes are counted starting from MEC0/pipe0 as 0. */
@@ -86,14 +84,6 @@ struct kfd2kgd_calls {
uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
uint64_t (*get_gpu_clock_counter)(struct kgd_dev *kgd);

-   /* SRBM_GFX_CNTL mutex */
-   void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
-   void (*unlock_srbm_gfx_cntl)(struct kgd_dev *kgd);
-
-   /* GRBM_GFX_INDEX mutex */
-   void (*lock_grbm_gfx_idx)(struct kgd_dev *kgd);
-   void (*unlock_grbm_gfx_idx)(struct kgd_dev *kgd);
-
uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);

/* Register access functions */
-- 
1.9.1


