Re: [Freedreno] [PATCH 1/4] drm/msm/a6xx: Remove unwanted regulator code

2018-11-20 Thread Stephen Boyd
Quoting Jordan Crouse (2018-11-19 15:47:03)
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 546599a7ab05..51493f409358 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -646,9 +646,6 @@ int a6xx_gmu_reset(struct a6xx_gpu *a6xx_gpu)
> gmu_poll_timeout(gmu, REG_A6XX_RSCC_TCS3_DRV0_STATUS, val,
> (val & 1), 100, 1000);
>  
> -   /* Force off the GX GDSC */
> -   regulator_force_disable(gmu->gx);
> -

I was going to ask if the regulator header file was included, but it
isn't, so nothing to fix!

> /* Disable the resources */
> clk_bulk_disable_unprepare(gmu->nr_clocks, gmu->clocks);
> pm_runtime_put_sync(gmu->dev);
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 4/4] drm/msm/gpu: Attach to the GPU GX power domain

2018-11-20 Thread Stephen Boyd
Quoting Jordan Crouse (2018-11-19 15:47:06)
> 99.999% of the time during normal operation the GMU is responsible
> for power and clock control on the GX domain and the CPU remains
> blissfully unaware. However, there is one situation where the CPU
> needs to get involved:
> 
> The power sequencing rules dictate that the GX needs to be turned
> off before the CX so that the CX can be turned on before the GX
> during power up. During normal operation when the CPU is taking
> down the CX domain a stop command is sent to the GMU which turns
> off the GX domain and then the CPU handles the CX domain.
> 
> But if the GMU happened to be unresponsive while the GX domain was
> left then the CPU will need to step in and turn off the GX domain

  ^ left on?

> before resetting the CX and rebooting the GMU. This unfortunately
> means that the CPU needs to be marginally aware of the GX domain
> even though it is expected to usually keep its hands off.
> 
> To support this we create a semi-disabled GX power domain that
> does nothing to the hardware on power up but tries to shut it
> down normally on power down. In this method the reference counting
> is correct and we can step in with the pm_runtime_put() at the right
> time during the failure path.
> 
> This patch sets up the connection to the GX power domain and does
> the magic to "enable" and disable it at the right points.
> 
> Signed-off-by: Jordan Crouse 

The pm_runtime gymnastics is scary! But I'm willing to go with it.
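[Editor's note: the reference-counting trick in the commit message — a domain whose power-up does nothing but whose power-down really shuts the hardware off — can be illustrated with a small userspace sketch. All names here (`fake_domain`, `fake_domain_get/put`) are hypothetical stand-ins, not the kernel's genpd/pm_runtime API; this is only a model of the bookkeeping, under the assumption that the "GMU" normally turns the hardware off before the last reference drops.]

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the "semi-disabled" GX domain idea: power_on is a no-op
 * (the GMU really owns the hardware), power_off does the real work
 * only when the hardware was left on, i.e. the failure path.
 */
struct fake_domain {
	int refcount;        /* mirrors pm_runtime usage counting */
	bool hw_enabled;     /* toggled by the "GMU", not the CPU */
	int force_off_calls; /* how often the CPU had to step in */
};

static void fake_domain_get(struct fake_domain *d)
{
	/* power up: touch nothing, just take the reference */
	d->refcount++;
}

static void fake_domain_put(struct fake_domain *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0 && d->hw_enabled) {
		/* power down: actually shut the domain off */
		d->hw_enabled = false;
		d->force_off_calls++;
	}
}
```

With this shape the reference count stays correct on both paths, and only the "GMU is unresponsive with GX left on" case ever causes the CPU-side shutdown to run.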

> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 41 ++-
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  2 ++
>  2 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 51493f409358..ca71709efc94 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -1142,9 +1169,15 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
> if (IS_ERR_OR_NULL(gmu->mmio))
> return;
>  
> -   pm_runtime_disable(gmu->dev);
> a6xx_gmu_stop(a6xx_gpu);
>  
> +   pm_runtime_disable(gmu->dev);
> +
> +   if (!IS_ERR(gmu->gxpd)) {
> +   pm_runtime_disable(gmu->gxpd);
> +   dev_pm_domain_detach(gmu->gxpd, false);
> +   }
> +
> a6xx_gmu_irq_disable(gmu);
> a6xx_gmu_memory_free(gmu, gmu->hfi);
>  
> @@ -1203,6 +1236,12 @@ int a6xx_gmu_probe(struct a6xx_gpu *a6xx_gpu, struct 
> device_node *node)
> if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0)
> goto err;
>  
> +   /*
> +* Get a link to the GX power domain to reset the GPU in case of GMU
> +* crash
> +*/
> +   gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");

Are there a6xx devices that don't have this genpd? Just curious if we
could always assume that if it's an a6xx gpu then this must be there and
we can always call pm_runtime_get/put on it and avoid the if (!IS_ERR())
checks.

> +
> /* Get the power levels for the GMU and GPU */
> a6xx_gmu_pwrlevels_probe(gmu);
>  


Re: [Freedreno] [PATCH 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-20 Thread Tomasz Figa
Hi Jordan, Vivek,

On Wed, Nov 21, 2018 at 12:41 AM Jordan Crouse  wrote:
>
> On Tue, Nov 20, 2018 at 03:24:37PM +0530, Vivek Gautam wrote:
> > dma_map_sg() expects a DMA domain. However, the drm devices
> > have been traditionally using unmanaged iommu domain which
> > is non-dma type. Using dma mapping APIs with that domain is bad.
> >
> > Replace dma_map_sg() calls with dma_sync_sg_for_device{|cpu}()
> > to do the cache maintenance.
> >
> > Signed-off-by: Vivek Gautam 
> > Suggested-by: Tomasz Figa 
> > ---
> >
> > Tested on an MTP sdm845:
> > https://github.com/vivekgautam1/linux/tree/v4.19/sdm845-mtp-display-working
> >
> >  drivers/gpu/drm/msm/msm_gem.c | 27 ---
> >  1 file changed, 20 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> > index 00c795ced02c..d7a7af610803 100644
> > --- a/drivers/gpu/drm/msm/msm_gem.c
> > +++ b/drivers/gpu/drm/msm/msm_gem.c
> > @@ -81,6 +81,8 @@ static struct page **get_pages(struct drm_gem_object *obj)
> >   struct drm_device *dev = obj->dev;
> >   struct page **p;
> >   int npages = obj->size >> PAGE_SHIFT;
> > + struct scatterlist *s;
> > + int i;
> >
> >   if (use_pages(obj))
> >   p = drm_gem_get_pages(obj);
> > @@ -107,9 +109,19 @@ static struct page **get_pages(struct drm_gem_object 
> > *obj)
> >   /* For non-cached buffers, ensure the new pages are clean
> >* because display controller, GPU, etc. are not coherent:
> >*/
> > - if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
> > - dma_map_sg(dev->dev, msm_obj->sgt->sgl,
> > - msm_obj->sgt->nents, 
> > DMA_BIDIRECTIONAL);
> > + if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED)) {
> > + /*
> > +  * Fake up the SG table so that dma_sync_sg_*()
> > +  * can be used to flush the pages associated with it.
> > +  */
>
> We aren't really faking.  The table is real, we are just slightly abusing the
> sg_dma_address() which makes this comment a bit misleading. Instead I would
> probably say something like:
>
> /* dma_sync_sg_* flushes pages using sg_dma_address() so point it at the
>  * physical page for the right behavior */
>
> Or something like that.
>

It's actually quite complicated, but I agree that the comment isn't
very precise. The cases are as follows:
- arm64 iommu_dma_ops use sg_phys()
https://elixir.bootlin.com/linux/v4.20-rc3/source/arch/arm64/mm/dma-mapping.c#L599
- swiotlb_dma_ops used on arm64 if no IOMMU is available use
sg->dma_address directly:
https://elixir.bootlin.com/linux/v4.20-rc3/source/kernel/dma/swiotlb.c#L832
- arm_dma_ops use sg_dma_address():
https://elixir.bootlin.com/linux/v4.20-rc3/source/arch/arm/mm/dma-mapping.c#L1130
- arm iommu_ops use sg_page():
https://elixir.bootlin.com/linux/v4.20-rc3/source/arch/arm/mm/dma-mapping.c#L1869

Sounds like a mess...
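[Editor's note: the trick the patch relies on — pre-loading each entry's `dma_address` with the physical address so every dma_ops variant above finds the right thing — can be modeled in userspace. `struct mock_sg` and `fixup_sg_for_sync` are simplified stand-ins, not the kernel's `struct scatterlist`; the point is only the one-line assignment per entry.]

```c
#include <assert.h>

/*
 * Before calling the dma_sync_sg_* helpers on a table that was never
 * dma_map_sg()'d, point each entry's dma_address at the physical page,
 * so whichever field the arch's sync implementation reads (sg_phys(),
 * sg_dma_address(), sg_page()-derived), it resolves consistently.
 */
typedef unsigned long long dma_addr_t;
typedef unsigned long long phys_addr_t;

struct mock_sg {
	phys_addr_t phys;       /* what sg_phys() would return */
	dma_addr_t dma_address; /* what sg_dma_address() reads */
};

static void fixup_sg_for_sync(struct mock_sg *sgl, int nents)
{
	for (int i = 0; i < nents; i++)
		sgl[i].dma_address = sgl[i].phys; /* sg_dma_address(s) = sg_phys(s) */
}
```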

> > + for_each_sg(msm_obj->sgt->sgl, s,
> > + msm_obj->sgt->nents, i)
> > + sg_dma_address(s) = sg_phys(s);
> > +
>
> I'm wondering - wouldn't we want to do this association for cached buffers too
> so we could sync them correctly in cpu_prep and cpu_fini?  Maybe it wouldn't
> hurt to put this association in the main path (obviously the sync should stay
> inside the conditional for uncached buffers).
>

I guess it wouldn't hurt indeed. Note that cpu_prep/fini seem to be
missing the sync call currently.

P.S. Jordan, not sure if it's my Gmail or your email client, but your
message had all the recipients in a Reply-to header, except you, so
pressing Reply to all in my case led to a message that didn't have you
in recipients anymore...

Best regards,
Tomasz


Re: [Freedreno] [PATCH 2/2] drm/msm/dsi: Get PHY ref clock from the DT

2018-11-20 Thread Doug Anderson
Hi,

On Wed, Nov 14, 2018 at 3:56 PM Matthias Kaehlcke  wrote:
>
> On Thu, Nov 08, 2018 at 02:04:31PM -0800, Doug Anderson wrote:
> > Hi,
> >
> > On Fri, Nov 2, 2018 at 2:45 PM Matthias Kaehlcke  wrote:
> > >
> > > Get the PHY ref clock from the device tree instead of hardcoding
> > > its name and rate.
> > >
> > > Signed-off-by: Matthias Kaehlcke 
> > > ---
> > >  drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c | 11 ++-
> > >  1 file changed, 10 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c 
> > > b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c
> > > index 4c03f0b7343ed..1016eb50df8f5 100644
> > > --- a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c
> > > +++ b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c
> > > @@ -91,6 +91,8 @@ struct dsi_pll_10nm {
> > > void __iomem *phy_cmn_mmio;
> > > void __iomem *mmio;
> > >
> > > +   struct clk *vco_ref_clk;
> > > +
> > > u64 vco_ref_clk_rate;
> > > u64 vco_current_rate;
> > >
> > > @@ -630,7 +632,8 @@ static int pll_10nm_register(struct dsi_pll_10nm 
> > > *pll_10nm)
> > > char clk_name[32], parent[32], vco_name[32];
> > > char parent2[32], parent3[32], parent4[32];
> > > struct clk_init_data vco_init = {
> > > -   .parent_names = (const char *[]){ "xo" },
> > > +   .parent_names = (const char *[]){
> > > +   __clk_get_name(pll_10nm->vco_ref_clk) },
> > > .num_parents = 1,
> > > .name = vco_name,
> > > .flags = CLK_IGNORE_UNUSED,
> > > @@ -786,6 +789,12 @@ struct msm_dsi_pll *msm_dsi_pll_10nm_init(struct 
> > > platform_device *pdev, int id)
> > > pll_10nm->id = id;
> > > pll_10nm_list[id] = pll_10nm;
> > >
> > > +   pll_10nm->vco_ref_clk = devm_clk_get(&pdev->dev, "ref");
> > > +   if (IS_ERR(pll_10nm->vco_ref_clk)) {
> > > +   dev_err(&pdev->dev, "couldn't get 'ref' clock\n");
> > > +   return (void *)pll_10nm->vco_ref_clk;
> > > +   }
> >
> > So, uh.  Can you follow the same pattern for all the other clocks
> > in this file too?  All parents should get their name based on
> > references in the device tree.
> >
> > It turns out that right now we have a mismatch because
> > "drivers/clk/qcom/dispcc-sdm845.c" calls "dsi0pllbyte"
> > "dsi0_phy_pll_out_byteclk" and calls "dsi0pll"
> > "dsi0_phy_pll_out_dsiclk".  We might want to change the names in
> > dispcc-sdm845.c, but it wouldn't matter if we simply didn't hardcode
> > them here.
>
> Hm, I understand the problem, but not quite what you mean with 'follow
> the same pattern'. The VCO ref clock is an 'external'/existing clock,
> hence it can be specified in the DT and obtained with
> devm_clk_get(). However, the clocks you mention above are 'created' by the
> PHY driver, so we could only specify their names in the DT, not sure
> if that's what you are suggesting. I guess 'clock-output-names' could
> be used, though it isn't really useful to describe the names in a
> clock tree. If you still think this should be done please share how
> you envision the DT entries to look.

Ah.  Right.  This is me being dumb.  As you said, the
"dsi0_phy_pll_out_byteclk" and "dsi0_phy_pll_out_dsiclk" clocks are
backwards of the one you're dealing with here.  No change needed in
your patch for this.

In theory we could do a follow-up patch where the dispcc-sdm845
bindings take phandle references to the PHY clock for the clocks it
consumes.  That would future-proof the bindings but I believe it
wouldn't really be possible to use them right now in code since the
clock framework doesn't really handle cases where two drivers mutually
produce / consume clocks from each other.


-Doug


Re: [Freedreno] [PATCH v2 5/5] drm/msm: subclass work object for vblank events

2018-11-20 Thread Jeykumar Sankaran

On 2018-11-07 07:55, Sean Paul wrote:

On Tue, Nov 06, 2018 at 02:36:30PM -0800, Jeykumar Sankaran wrote:

msm maintains a separate structure to define vblank
work definitions and a list to track events submitted
to the workqueue. We can avoid this redundant list
and its protection mechanism, if we subclass the
work object to encapsulate vblank event parameters.

changes in v2:
- subclass optimization on system wq (Sean Paul)


I wouldn't do it like this, tbh. One problem is that you've lost your
flush() on unbind, so there's no way to know if you have workers in the
wild waiting to enable/disable vblank.

Another issue is that AFAICT, we don't need a queue of enables/disables,
but rather just the last requested state (ie: should we be on or off). So
things don't need to be this complicated (and we're possibly thrashing
vblank on/off for no reason).

I'm still of the mind that you should just make this synchronous and be
done with the threads (especially since we're still uncovering/introducing
races!).


While scoping out the effort to make vblank events synchronous, I found
that the spinlock locking order of the vblank request sequence and the
vblank callback sequence are opposites of each other.

In DPU, drm_vblank_enable acquires vblank_time_lock before registering
the crtc to the encoder, which happens after acquiring encoder_spinlock.
But the vblank callback acquires encoder_spinlock before accessing the
registered crtc and calling into drm_vblank_handler, which tries to
acquire vblank_time_lock. Acquiring both vblank_time_lock and
encoder_spinlock in the same thread leads to deadlock.

In MDP5, I see the same pattern between vblank_time_lock and the
list_lock used to track the irq handlers.

I believe that explains why msm_drv queues the vblank enable/disable
works to the WQ after acquiring vblank_time_lock.
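[Editor's note: the AB-BA inversion described above can be shown with a minimal pthread sketch. `lock_a`/`lock_b` are hypothetical stand-ins for vblank_time_lock and encoder_spinlock; the version below applies the fix (a single global lock order), and the comments note the inverted order that would deadlock.]

```c
#include <pthread.h>
#include <assert.h>

/*
 * Two code paths that both need lock_a ("vblank_time_lock") and
 * lock_b ("encoder_spinlock"). If one path took a-then-b while the
 * other took b-then-a, each could end up holding one lock and
 * blocking forever on the other. With one global order, they can't.
 */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int work_done;

static void *request_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock_a); /* always a before b */
	pthread_mutex_lock(&lock_b);
	work_done++;
	pthread_mutex_unlock(&lock_b);
	pthread_mutex_unlock(&lock_a);
	return NULL;
}

static void *callback_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock_a); /* same order — NOT b then a */
	pthread_mutex_lock(&lock_b);
	work_done++;
	pthread_mutex_unlock(&lock_b);
	pthread_mutex_unlock(&lock_a);
	return NULL;
}
```

Deferring one side to a workqueue, as msm_drv does, is another way out: the deferred side then takes its locks outside the other lock's critical section, so both locks are never held in inverted order on one thread.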

Thanks,
Jeykumar S.


Sean



Signed-off-by: Jeykumar Sankaran 
---
 drivers/gpu/drm/msm/msm_drv.c | 67 +--
 drivers/gpu/drm/msm/msm_drv.h |  7 -
 2 files changed, 20 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 6d6c73b..8da5be2 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -203,61 +203,44 @@ u32 msm_readl(const void __iomem *addr)
 	return val;
 }
 
-struct vblank_event {
-	struct list_head node;
+struct msm_vblank_work {
+	struct work_struct work;
 	int crtc_id;
 	bool enable;
+	struct msm_drm_private *priv;
 };
 
 static void vblank_ctrl_worker(struct work_struct *work)
 {
-	struct msm_vblank_ctrl *vbl_ctrl = container_of(work,
-			struct msm_vblank_ctrl, work);
-	struct msm_drm_private *priv = container_of(vbl_ctrl,
-			struct msm_drm_private, vblank_ctrl);
+	struct msm_vblank_work *vbl_work = container_of(work,
+			struct msm_vblank_work, work);
+	struct msm_drm_private *priv = vbl_work->priv;
 	struct msm_kms *kms = priv->kms;
-	struct vblank_event *vbl_ev, *tmp;
-	unsigned long flags;
-
-	spin_lock_irqsave(&vbl_ctrl->lock, flags);
-	list_for_each_entry_safe(vbl_ev, tmp, &vbl_ctrl->event_list, node) {
-		list_del(&vbl_ev->node);
-		spin_unlock_irqrestore(&vbl_ctrl->lock, flags);
-
-		if (vbl_ev->enable)
-			kms->funcs->enable_vblank(kms,
-					priv->crtcs[vbl_ev->crtc_id]);
-		else
-			kms->funcs->disable_vblank(kms,
-					priv->crtcs[vbl_ev->crtc_id]);
-
-		kfree(vbl_ev);
-
-		spin_lock_irqsave(&vbl_ctrl->lock, flags);
-	}
+	if (vbl_work->enable)
+		kms->funcs->enable_vblank(kms, priv->crtcs[vbl_work->crtc_id]);
+	else
+		kms->funcs->disable_vblank(kms, priv->crtcs[vbl_work->crtc_id]);
 
-	spin_unlock_irqrestore(&vbl_ctrl->lock, flags);
+	kfree(vbl_work);
 }
 
 static int vblank_ctrl_queue_work(struct msm_drm_private *priv,
 					int crtc_id, bool enable)
 {
-	struct msm_vblank_ctrl *vbl_ctrl = &priv->vblank_ctrl;
-	struct vblank_event *vbl_ev;
-	unsigned long flags;
+	struct msm_vblank_work *vbl_work;
 
-	vbl_ev = kzalloc(sizeof(*vbl_ev), GFP_ATOMIC);
-	if (!vbl_ev)
+	vbl_work = kzalloc(sizeof(*vbl_work), GFP_ATOMIC);
+	if (!vbl_work)
 		return -ENOMEM;
 
-	vbl_ev->crtc_id = crtc_id;
-	vbl_ev->enable = enable;
+	INIT_WORK(&vbl_work->work, vblank_ctrl_worker);
 
-	spin_lock_irqsave(&vbl_ctrl->lock, flags);
-	list_add_tail(&vbl_ev->node, &vbl_ctrl->event_list);
-	spin_unlock_irqrestore(&vbl_ctrl->lock, flags);
+	vbl_work->crtc_id = crtc_id;
+	vbl_work->enable = enable;
+	vbl_work->priv = priv;
 
-	schedule_work(&vbl_ctrl->work);
+   

Re: [Freedreno] [PATCH 1/4] drm/edid: Pass connector to AVI infoframe functions

2018-11-20 Thread Thierry Reding
On Tue, Nov 20, 2018 at 06:13:42PM +0200, Ville Syrjala wrote:
> From: Ville Syrjälä 
> 
> Make life easier for drivers by simply passing the connector
> to drm_hdmi_avi_infoframe_from_display_mode() and
> drm_hdmi_avi_infoframe_quant_range(). That way drivers don't
> need to worry about is_hdmi2_sink mess.
> 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: "David (ChunMing) Zhou" 
> Cc: Archit Taneja 
> Cc: Andrzej Hajda 
> Cc: Laurent Pinchart 
> Cc: Inki Dae 
> Cc: Joonyoung Shim 
> Cc: Seung-Woo Kim 
> Cc: Kyungmin Park 
> Cc: Russell King 
> Cc: CK Hu 
> Cc: Philipp Zabel 
> Cc: Rob Clark 
> Cc: Ben Skeggs 
> Cc: Tomi Valkeinen 
> Cc: Sandy Huang 
> Cc: "Heiko Stübner" 
> Cc: Benjamin Gaignard 
> Cc: Vincent Abriou 
> Cc: Thierry Reding 
> Cc: Eric Anholt 
> Cc: Shawn Guo 
> Cc: Ilia Mirkin 
> Cc: amd-...@lists.freedesktop.org
> Cc: linux-arm-...@vger.kernel.org
> Cc: freedreno@lists.freedesktop.org
> Cc: nouv...@lists.freedesktop.org
> Cc: linux-te...@vger.kernel.org
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c|  2 +-
>  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c|  2 +-
>  drivers/gpu/drm/amd/amdgpu/dce_v6_0.c |  3 ++-
>  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c |  2 +-
>  drivers/gpu/drm/bridge/analogix-anx78xx.c |  5 ++--
>  drivers/gpu/drm/bridge/sii902x.c  |  3 ++-
>  drivers/gpu/drm/bridge/sil-sii8620.c  |  3 +--
>  drivers/gpu/drm/bridge/synopsys/dw-hdmi.c |  3 ++-
>  drivers/gpu/drm/drm_edid.c| 33 ++-
>  drivers/gpu/drm/exynos/exynos_hdmi.c  |  3 ++-
>  drivers/gpu/drm/i2c/tda998x_drv.c |  3 ++-
>  drivers/gpu/drm/i915/intel_hdmi.c | 14 +-
>  drivers/gpu/drm/i915/intel_lspcon.c   | 15 ++-
>  drivers/gpu/drm/i915/intel_sdvo.c | 10 ---
>  drivers/gpu/drm/mediatek/mtk_hdmi.c   |  3 ++-
>  drivers/gpu/drm/msm/hdmi/hdmi_bridge.c|  3 ++-
>  drivers/gpu/drm/nouveau/dispnv50/disp.c   |  7 +++--
>  drivers/gpu/drm/omapdrm/omap_encoder.c|  5 ++--
>  drivers/gpu/drm/radeon/radeon_audio.c |  2 +-
>  drivers/gpu/drm/rockchip/inno_hdmi.c  |  4 ++-
>  drivers/gpu/drm/sti/sti_hdmi.c|  3 ++-
>  drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c|  3 ++-
>  drivers/gpu/drm/tegra/hdmi.c  |  3 ++-
>  drivers/gpu/drm/tegra/sor.c   |  3 ++-
>  drivers/gpu/drm/vc4/vc4_hdmi.c| 11 +---
>  drivers/gpu/drm/zte/zx_hdmi.c |  4 ++-
>  include/drm/drm_edid.h|  8 +++---
>  27 files changed, 94 insertions(+), 66 deletions(-)

That's actually a lot nicer:

Acked-by: Thierry Reding 




[Freedreno] [PATCH 1/4] drm/edid: Pass connector to AVI infoframe functions

2018-11-20 Thread Ville Syrjala
From: Ville Syrjälä 

Make life easier for drivers by simply passing the connector
to drm_hdmi_avi_infoframe_from_display_mode() and
drm_hdmi_avi_infoframe_quant_range(). That way drivers don't
need to worry about is_hdmi2_sink mess.

Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "David (ChunMing) Zhou" 
Cc: Archit Taneja 
Cc: Andrzej Hajda 
Cc: Laurent Pinchart 
Cc: Inki Dae 
Cc: Joonyoung Shim 
Cc: Seung-Woo Kim 
Cc: Kyungmin Park 
Cc: Russell King 
Cc: CK Hu 
Cc: Philipp Zabel 
Cc: Rob Clark 
Cc: Ben Skeggs 
Cc: Tomi Valkeinen 
Cc: Sandy Huang 
Cc: "Heiko Stübner" 
Cc: Benjamin Gaignard 
Cc: Vincent Abriou 
Cc: Thierry Reding 
Cc: Eric Anholt 
Cc: Shawn Guo 
Cc: Ilia Mirkin 
Cc: amd-...@lists.freedesktop.org
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouv...@lists.freedesktop.org
Cc: linux-te...@vger.kernel.org
Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/amd/amdgpu/dce_v10_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/dce_v11_0.c|  2 +-
 drivers/gpu/drm/amd/amdgpu/dce_v6_0.c |  3 ++-
 drivers/gpu/drm/amd/amdgpu/dce_v8_0.c |  2 +-
 drivers/gpu/drm/bridge/analogix-anx78xx.c |  5 ++--
 drivers/gpu/drm/bridge/sii902x.c  |  3 ++-
 drivers/gpu/drm/bridge/sil-sii8620.c  |  3 +--
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c |  3 ++-
 drivers/gpu/drm/drm_edid.c| 33 ++-
 drivers/gpu/drm/exynos/exynos_hdmi.c  |  3 ++-
 drivers/gpu/drm/i2c/tda998x_drv.c |  3 ++-
 drivers/gpu/drm/i915/intel_hdmi.c | 14 +-
 drivers/gpu/drm/i915/intel_lspcon.c   | 15 ++-
 drivers/gpu/drm/i915/intel_sdvo.c | 10 ---
 drivers/gpu/drm/mediatek/mtk_hdmi.c   |  3 ++-
 drivers/gpu/drm/msm/hdmi/hdmi_bridge.c|  3 ++-
 drivers/gpu/drm/nouveau/dispnv50/disp.c   |  7 +++--
 drivers/gpu/drm/omapdrm/omap_encoder.c|  5 ++--
 drivers/gpu/drm/radeon/radeon_audio.c |  2 +-
 drivers/gpu/drm/rockchip/inno_hdmi.c  |  4 ++-
 drivers/gpu/drm/sti/sti_hdmi.c|  3 ++-
 drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c|  3 ++-
 drivers/gpu/drm/tegra/hdmi.c  |  3 ++-
 drivers/gpu/drm/tegra/sor.c   |  3 ++-
 drivers/gpu/drm/vc4/vc4_hdmi.c| 11 +---
 drivers/gpu/drm/zte/zx_hdmi.c |  4 ++-
 include/drm/drm_edid.h|  8 +++---
 27 files changed, 94 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
index 4cfecdce29a3..1f0426d2fc2a 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
@@ -1682,7 +1682,7 @@ static void dce_v10_0_afmt_setmode(struct drm_encoder 
*encoder,
dce_v10_0_audio_write_sad_regs(encoder);
dce_v10_0_audio_write_latency_fields(encoder, mode);
 
-   err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+   err = drm_hdmi_avi_infoframe_from_display_mode(&frame, connector, mode);
if (err < 0) {
DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
return;
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
index 7c868916d90f..2280b971d758 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
@@ -1724,7 +1724,7 @@ static void dce_v11_0_afmt_setmode(struct drm_encoder 
*encoder,
dce_v11_0_audio_write_sad_regs(encoder);
dce_v11_0_audio_write_latency_fields(encoder, mode);
 
-   err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+   err = drm_hdmi_avi_infoframe_from_display_mode(&frame, connector, mode);
if (err < 0) {
DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
return;
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
index 17eaaba36017..db443ec53d3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
@@ -1423,6 +1423,7 @@ static void dce_v6_0_audio_set_avi_infoframe(struct 
drm_encoder *encoder,
struct amdgpu_device *adev = dev->dev_private;
struct amdgpu_encoder *amdgpu_encoder = to_amdgpu_encoder(encoder);
struct amdgpu_encoder_atom_dig *dig = amdgpu_encoder->enc_priv;
+   struct drm_connector *connector = 
amdgpu_get_connector_for_encoder(encoder);
struct hdmi_avi_infoframe frame;
u8 buffer[HDMI_INFOFRAME_HEADER_SIZE + HDMI_AVI_INFOFRAME_SIZE];
uint8_t *payload = buffer + 3;
@@ -1430,7 +1431,7 @@ static void dce_v6_0_audio_set_avi_infoframe(struct 
drm_encoder *encoder,
ssize_t err;
u32 tmp;
 
-   err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
+   err = drm_hdmi_avi_infoframe_from_display_mode(&frame, connector, mode);
if (err < 0) {
DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
return;
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c 

Re: [Freedreno] [PATCH 4/4] drm/msm: Optimize adreno_show_object()

2018-11-20 Thread Jordan Crouse
On Tue, Nov 20, 2018 at 05:07:31PM +0530, Sharat Masetty wrote:
> When the userspace tries to read the crashstate dump, the read side
> implementation in the driver currently ascii85 encodes all the binary
> buffers and it does this each time the read system call is called.
> A userspace tool like cat typically does a page by page read and the
> number of read calls depends on the size of the data captured by the
> driver. This is certainly not desirable and does not scale well with
> large captures.
> 
> This patch encodes the buffer only once in the read path. With this there
> is an immediate >10X speed improvement in crashstate save time.
> 
> Signed-off-by: Sharat Masetty 

Reviewed-by: Jordan Crouse 

I like this a lot.

> ---
> Changes from v1:
>   Addressed comments from Jordan Crouse
> 
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 80 
> -
>  drivers/gpu/drm/msm/msm_gpu.h   |  1 +
>  2 files changed, 60 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index bbf8d3e..7749967 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -441,11 +441,15 @@ void adreno_gpu_state_destroy(struct msm_gpu_state 
> *state)
>  {
>   int i;
> 
> - for (i = 0; i < ARRAY_SIZE(state->ring); i++)
> + for (i = 0; i < ARRAY_SIZE(state->ring); i++) {
>   kvfree(state->ring[i].bo.data);
> + kvfree(state->ring[i].bo.encoded);
> + }
> 
> - for (i = 0; state->bos && i < state->nr_bos; i++)
> + for (i = 0; state->bos && i < state->nr_bos; i++) {
>   kvfree(state->bos[i].data);
> + kvfree(state->bos[i].encoded);
> + }
> 
>   kfree(state->bos);
>   kfree(state->comm);
> @@ -472,34 +476,70 @@ int adreno_gpu_state_put(struct msm_gpu_state *state)
> 
>  #if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)
> 
> -static void adreno_show_object(struct drm_printer *p, u32 *ptr, int len)
> +static char *adreno_gpu_ascii85_encode(u32 *src, size_t len)
>  {
> - char out[ASCII85_BUFSZ];
> - long l, datalen, i;
> + void *buf;
> + size_t buf_itr = 0;
> + long i, l;
> 
> - if (!ptr || !len)
> - return;
> + if (!len)
> + return NULL;
> +
> + l = ascii85_encode_len(len);
> 
>   /*
> -  * Only dump the non-zero part of the buffer - rarely will any data
> -  * completely fill the entire allocated size of the buffer
> +  * ascii85 outputs either a 5 byte string or a 1 byte string. So we
> +  * account for the worst case of 5 bytes per dword plus the 1 for '\0'
>*/
> - for (datalen = 0, i = 0; i < len >> 2; i++) {
> - if (ptr[i])
> - datalen = (i << 2) + 1;
> + buf = kvmalloc((l * 5) + 1, GFP_KERNEL);
> + if (!buf)
> + return NULL;
> +
> + for (i = 0; i < l; i++) {
> + ascii85_encode(src[i], buf + buf_itr);
> +
> + if (src[i] == 0)
> + buf_itr += 1;
> + else
> + buf_itr += 5;
>   }
> 
> - /* Skip printing the object if it is empty */
> - if (datalen == 0)
> + return buf;
> +}
> +
> +static void adreno_show_object(struct drm_printer *p,
> + struct msm_gpu_state_bo *bo)
> +{
> + if ((!bo->data && !bo->encoded) || !bo->size)
>   return;
> 
> - l = ascii85_encode_len(datalen);
> + if (!bo->encoded) {
> + long datalen, i;
> + u32 *buf = bo->data;
> +
> + /*
> +  * Only dump the non-zero part of the buffer - rarely will
> +  * any data completely fill the entire allocated size of
> +  * the buffer.
> +  */
> + for (datalen = 0, i = 0; i < (bo->size) >> 2; i++) {
> + if (buf[i])
> + datalen = ((i + 1) << 2);
> + }
> +
> + bo->encoded = adreno_gpu_ascii85_encode(buf, datalen);
> +
> + kvfree(bo->data);
> + bo->data = NULL;
> +
> + if (!bo->encoded)
> + return;
> + }
> 
>   drm_puts(p, "data: !!ascii85 |\n");
>   drm_puts(p, " ");
> 
> - for (i = 0; i < l; i++)
> - drm_puts(p, ascii85_encode(ptr[i], out));
> + drm_puts(p, bo->encoded);
> 
>   drm_puts(p, "\n");
>  }
> @@ -531,8 +571,7 @@ void adreno_show(struct msm_gpu *gpu, struct 
> msm_gpu_state *state,
>   drm_printf(p, "wptr: %d\n", state->ring[i].wptr);
>   drm_printf(p, "size: %d\n", MSM_GPU_RINGBUFFER_SZ);
> 
> - adreno_show_object(p, state->ring[i].bo.data,
> - state->ring[i].bo.size);
> + adreno_show_object(p, &(state->ring[i].bo));
>   }
> 
>   if (state->bos) {
> @@ -543,8 +582,7 @@ void adreno_show(struct 

Re: [Freedreno] [PATCH 3/4] drm/msm: Use msm_gpu_state_bo for ringbuffer data

2018-11-20 Thread Jordan Crouse
On Tue, Nov 20, 2018 at 05:07:30PM +0530, Sharat Masetty wrote:
> The ring substructure in msm_gpu_state is an extension of
> msm_gpu_state_bo, so this patch changes the ring structure
> to reuse the msm_gpu_state_bo as a base class, instead of
> redefining the required variables.
> 
> Signed-off-by: Sharat Masetty 

Looks okay.  There might be some code consolidation to be had here but perhaps
you've already addressed this in a future patch or we can do it as a rainy day
cleanup.

Reviewed-by: Jordan Crouse 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 20 +++-
>  drivers/gpu/drm/msm/msm_gpu.h   |  4 +---
>  2 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index 6ebe842..bbf8d3e 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -383,7 +383,7 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct 
> msm_gpu_state *state)
>   int size = 0, j;
>  
>   state->ring[i].fence = gpu->rb[i]->memptrs->fence;
> - state->ring[i].iova = gpu->rb[i]->iova;
> + state->ring[i].bo.iova = gpu->rb[i]->iova;
>   state->ring[i].seqno = gpu->rb[i]->seqno;
>   state->ring[i].rptr = get_rptr(adreno_gpu, gpu->rb[i]);
>   state->ring[i].wptr = get_wptr(gpu->rb[i]);
> @@ -397,10 +397,12 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct 
> msm_gpu_state *state)
>   size = j + 1;
>  
>   if (size) {
> - state->ring[i].data = kvmalloc(size << 2, GFP_KERNEL);
> - if (state->ring[i].data) {
> - memcpy(state->ring[i].data, gpu->rb[i]->start, 
> size << 2);
> - state->ring[i].data_size = size << 2;
> + state->ring[i].bo.data =
> + kvmalloc(size << 2, GFP_KERNEL);
> + if (state->ring[i].bo.data) {
> + memcpy(state->ring[i].bo.data,
> + gpu->rb[i]->start, size << 2);
> + state->ring[i].bo.size = size << 2;

Today I learned that kvmemdup() does not exist and I was sad about it.

>   }
>   }
>   }
> @@ -440,7 +442,7 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
>   int i;
>  
>   for (i = 0; i < ARRAY_SIZE(state->ring); i++)
> - kvfree(state->ring[i].data);
> + kvfree(state->ring[i].bo.data);
>  
>   for (i = 0; state->bos && i < state->nr_bos; i++)
>   kvfree(state->bos[i].data);
> @@ -522,15 +524,15 @@ void adreno_show(struct msm_gpu *gpu, struct 
> msm_gpu_state *state,
>  
>   for (i = 0; i < gpu->nr_rings; i++) {
>   drm_printf(p, "  - id: %d\n", i);
> - drm_printf(p, "iova: 0x%016llx\n", state->ring[i].iova);
> + drm_printf(p, "iova: 0x%016llx\n", state->ring[i].bo.iova);
>   drm_printf(p, "last-fence: %d\n", state->ring[i].seqno);
>   drm_printf(p, "retired-fence: %d\n", state->ring[i].fence);
>   drm_printf(p, "rptr: %d\n", state->ring[i].rptr);
>   drm_printf(p, "wptr: %d\n", state->ring[i].wptr);
>   drm_printf(p, "size: %d\n", MSM_GPU_RINGBUFFER_SZ);
>  
> - adreno_show_object(p, state->ring[i].data,
> - state->ring[i].data_size);
> + adreno_show_object(p, state->ring[i].bo.data,
> + state->ring[i].bo.size);
>   }
>  
>   if (state->bos) {
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index 7dc775f..a3a6371 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -198,13 +198,11 @@ struct msm_gpu_state {
>   struct timeval time;
>  
>   struct {
> - u64 iova;
>   u32 fence;
>   u32 seqno;
>   u32 rptr;
>   u32 wptr;
> - void *data;
> - int data_size;
> + struct msm_gpu_state_bo bo;
>   } ring[MSM_GPU_MAX_RINGS];
>  
>   int nr_registers;
> -- 
> 1.9.1
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [Freedreno] [PATCH 2/4] include/linux/ascii85: Update ascii85_encode()

2018-11-20 Thread Jordan Crouse
On Tue, Nov 20, 2018 at 05:07:29PM +0530, Sharat Masetty wrote:
> The current implementation of ascii85_encode() returns a static "z"
> string instead of writing it to the output buffer when the input is
> zero. This patch simply adds the missing copy, which makes it easier
> to use this function to encode large buffers.
> 
> Signed-off-by: Sharat Masetty 

Adding intel-gfx since they are the other consumers of this code, but this
seems legit to me.

Jordan

> ---
>  include/linux/ascii85.h | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/ascii85.h b/include/linux/ascii85.h
> index 4cc4020..00646fc 100644
> --- a/include/linux/ascii85.h
> +++ b/include/linux/ascii85.h
> @@ -23,8 +23,12 @@
>  {
>   int i;
>  
> - if (in == 0)
> - return "z";
> + if (in == 0) {
> + out[0] = 'z';
> + out[1] = '\0';
> +
> + return out;
> + }
>  
>   out[5] = '\0';
>   for (i = 5; i--; ) {
> -- 
> 1.9.1
> 

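As a sanity check of the zero case under review, here is a standalone userspace rendering of the patched helper; it mirrors the per-dword arithmetic of ascii85_encode() in include/linux/ascii85.h, under a local name:

```c
#include <stdint.h>

/* Userspace copy of the patched helper: a zero dword now writes "z"
 * into 'out' rather than returning a static string, so callers that
 * stream a large buffer can always append from 'out'. */
static char *a85_encode(uint32_t in, char *out)
{
	int i;

	if (in == 0) {
		out[0] = 'z';
		out[1] = '\0';
		return out;
	}

	out[5] = '\0';
	for (i = 5; i--; ) {
		out[i] = '!' + in % 85;
		in /= 85;
	}
	return out;
}
```

`a85_encode(0, buf)` yields `"z"`, and a non-zero dword yields the usual five-character group, either way written through `out`.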


Re: [Freedreno] [PATCH 1/4] drm/msm: use kvmalloc for ring data in gpu crashstate

2018-11-20 Thread Jordan Crouse
On Tue, Nov 20, 2018 at 05:07:28PM +0530, Sharat Masetty wrote:
> The ringbuffer data captured at crash time can sometimes be large, and
> the size can vary from less than a page to the full 32KB. So use the
> kvmalloc variant, which fits the bill perfectly.
> 
> Signed-off-by: Sharat Masetty 

Reviewed-by: Jordan Crouse 

> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index d88d00d..6ebe842 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -397,7 +397,7 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct 
> msm_gpu_state *state)
>   size = j + 1;
>  
>   if (size) {
> - state->ring[i].data = kmalloc(size << 2, GFP_KERNEL);
> + state->ring[i].data = kvmalloc(size << 2, GFP_KERNEL);
>   if (state->ring[i].data) {
>   memcpy(state->ring[i].data, gpu->rb[i]->start, 
> size << 2);
>   state->ring[i].data_size = size << 2;
> @@ -440,7 +440,7 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
>   int i;
>  
>   for (i = 0; i < ARRAY_SIZE(state->ring); i++)
> - kfree(state->ring[i].data);
> + kvfree(state->ring[i].data);
>  
>   for (i = 0; state->bos && i < state->nr_bos; i++)
>   kvfree(state->bos[i].data);
> -- 
> 1.9.1
> 



Re: [Freedreno] [PATCH 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-20 Thread Jordan Crouse
On Tue, Nov 20, 2018 at 03:24:37PM +0530, Vivek Gautam wrote:
> dma_map_sg() expects a DMA domain. However, the drm devices have
> traditionally been using an unmanaged iommu domain, which is not a
> DMA type. Using the DMA mapping APIs with that domain is bad.
> 
> Replace dma_map_sg() calls with dma_sync_sg_for_device{|cpu}()
> to do the cache maintenance.
> 
> Signed-off-by: Vivek Gautam 
> Suggested-by: Tomasz Figa 
> ---
> 
> Tested on an MTP sdm845:
> https://github.com/vivekgautam1/linux/tree/v4.19/sdm845-mtp-display-working
> 
>  drivers/gpu/drm/msm/msm_gem.c | 27 ---
>  1 file changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index 00c795ced02c..d7a7af610803 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -81,6 +81,8 @@ static struct page **get_pages(struct drm_gem_object *obj)
>   struct drm_device *dev = obj->dev;
>   struct page **p;
>   int npages = obj->size >> PAGE_SHIFT;
> + struct scatterlist *s;
> + int i;
>  
>   if (use_pages(obj))
>   p = drm_gem_get_pages(obj);
> @@ -107,9 +109,19 @@ static struct page **get_pages(struct drm_gem_object 
> *obj)
>   /* For non-cached buffers, ensure the new pages are clean
>* because display controller, GPU, etc. are not coherent:
>*/
> - if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
> - dma_map_sg(dev->dev, msm_obj->sgt->sgl,
> - msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
> + if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED)) {
> + /*
> +  * Fake up the SG table so that dma_sync_sg_*()
> +  * can be used to flush the pages associated with it.
> +  */

We aren't really faking. The table is real; we are just slightly abusing
sg_dma_address(), which makes this comment a bit misleading. Instead I would
probably say something like:

/* dma_sync_sg_* flushes pages using sg_dma_address() so point it at the
 * physical page for the right behavior */

Or something like that.

> + for_each_sg(msm_obj->sgt->sgl, s,
> + msm_obj->sgt->nents, i)
> + sg_dma_address(s) = sg_phys(s);
> +

I'm wondering - wouldn't we want to do this association for cached buffers too,
so we could sync them correctly in cpu_prep and cpu_fini?  Maybe it wouldn't hurt
to put this association in the main path (obviously the sync should stay inside
the conditional for uncached buffers).

> + dma_sync_sg_for_device(dev->dev, msm_obj->sgt->sgl,
> +msm_obj->sgt->nents,
> +DMA_TO_DEVICE);
> + }
>   }
>  
>   return msm_obj->pages;
> @@ -137,10 +149,11 @@ static void put_pages(struct drm_gem_object *obj)
>* pages are clean because display controller,
>* GPU, etc. are not coherent:
>*/
> - if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
> - dma_unmap_sg(obj->dev->dev, msm_obj->sgt->sgl,
> -  msm_obj->sgt->nents,
> -  DMA_BIDIRECTIONAL);
> + if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED))
> + dma_sync_sg_for_cpu(obj->dev->dev,
> + msm_obj->sgt->sgl,
> + msm_obj->sgt->nents,
> + DMA_FROM_DEVICE);
>  
>   sg_free_table(msm_obj->sgt);
>   kfree(msm_obj->sgt);

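The association being discussed can be modeled in a few lines of userspace C; everything below is a toy mock with hypothetical names, not the kernel scatterlist API:

```c
#include <stdint.h>

/* Toy scatterlist entry: 'phys' stands in for sg_phys(s) and
 * 'dma_address' for sg_dma_address(s). */
struct sg_entry {
	uint64_t phys;
	uint64_t dma_address;
};

/* Mirrors the shape of dma_sync_sg_for_device(): the sync walks
 * dma_address, so it only touches the right pages if dma_address was
 * pointed somewhere meaningful first. */
static void sync_for_device(struct sg_entry *sg, int nents,
			    void (*flush)(uint64_t))
{
	int i;

	for (i = 0; i < nents; i++)
		flush(sg[i].dma_address);
}

/* The fix-up from the patch: with no IOMMU/DMA mapping made,
 * dma_address would otherwise be unset and the sync would miss, so
 * point it at the physical address of each entry. */
static void fake_up_dma_addresses(struct sg_entry *sg, int nents)
{
	int i;

	for (i = 0; i < nents; i++)
		sg[i].dma_address = sg[i].phys;
}
```

The question above then amounts to: should `fake_up_dma_addresses()` run for every buffer at creation, with only the sync itself kept conditional on the caching flags.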


[Freedreno] [PATCH 1/4] drm/msm: use kvmalloc for ring data in gpu crashstate

2018-11-20 Thread Sharat Masetty
The ringbuffer data captured at crash time can sometimes be large, and
the size can vary from less than a page to the full 32KB. So use the
kvmalloc variant, which fits the bill perfectly.

Signed-off-by: Sharat Masetty 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index d88d00d..6ebe842 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -397,7 +397,7 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct 
msm_gpu_state *state)
size = j + 1;
 
if (size) {
-   state->ring[i].data = kmalloc(size << 2, GFP_KERNEL);
+   state->ring[i].data = kvmalloc(size << 2, GFP_KERNEL);
if (state->ring[i].data) {
memcpy(state->ring[i].data, gpu->rb[i]->start, 
size << 2);
state->ring[i].data_size = size << 2;
@@ -440,7 +440,7 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
int i;
 
for (i = 0; i < ARRAY_SIZE(state->ring); i++)
-   kfree(state->ring[i].data);
+   kvfree(state->ring[i].data);
 
for (i = 0; state->bos && i < state->nr_bos; i++)
kvfree(state->bos[i].data);
-- 
1.9.1



[Freedreno] [PATCH 4/4] drm/msm: Optimize adreno_show_object()

2018-11-20 Thread Sharat Masetty
When userspace tries to read the crash state dump, the read-side
implementation in the driver currently ascii85-encodes all the binary
buffers, and it does this each time the read system call is made.
A userspace tool like cat typically does a page-by-page read, and the
number of read calls grows with the size of the data captured by the
driver. This is not desirable and does not scale well with large
captures.

This patch encodes each buffer only once in the read path. With this
there is an immediate >10X speed improvement in crash state save time.

Signed-off-by: Sharat Masetty 
---
Changes from v1:
Addressed comments from Jordan Crouse

 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 80 -
 drivers/gpu/drm/msm/msm_gpu.h   |  1 +
 2 files changed, 60 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index bbf8d3e..7749967 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -441,11 +441,15 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
 {
int i;

-   for (i = 0; i < ARRAY_SIZE(state->ring); i++)
+   for (i = 0; i < ARRAY_SIZE(state->ring); i++) {
kvfree(state->ring[i].bo.data);
+   kvfree(state->ring[i].bo.encoded);
+   }

-   for (i = 0; state->bos && i < state->nr_bos; i++)
+   for (i = 0; state->bos && i < state->nr_bos; i++) {
kvfree(state->bos[i].data);
+   kvfree(state->bos[i].encoded);
+   }

kfree(state->bos);
kfree(state->comm);
@@ -472,34 +476,70 @@ int adreno_gpu_state_put(struct msm_gpu_state *state)

 #if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)

-static void adreno_show_object(struct drm_printer *p, u32 *ptr, int len)
+static char *adreno_gpu_ascii85_encode(u32 *src, size_t len)
 {
-   char out[ASCII85_BUFSZ];
-   long l, datalen, i;
+   void *buf;
+   size_t buf_itr = 0;
+   long i, l;

-   if (!ptr || !len)
-   return;
+   if (!len)
+   return NULL;
+
+   l = ascii85_encode_len(len);

/*
-* Only dump the non-zero part of the buffer - rarely will any data
-* completely fill the entire allocated size of the buffer
+* ascii85 outputs either a 5 byte string or a 1 byte string. So we
+* account for the worst case of 5 bytes per dword plus the 1 for '\0'
 */
-   for (datalen = 0, i = 0; i < len >> 2; i++) {
-   if (ptr[i])
-   datalen = (i << 2) + 1;
+   buf = kvmalloc((l * 5) + 1, GFP_KERNEL);
+   if (!buf)
+   return NULL;
+
+   for (i = 0; i < l; i++) {
+   ascii85_encode(src[i], buf + buf_itr);
+
+   if (src[i] == 0)
+   buf_itr += 1;
+   else
+   buf_itr += 5;
}

-   /* Skip printing the object if it is empty */
-   if (datalen == 0)
+   return buf;
+}
+
+static void adreno_show_object(struct drm_printer *p,
+   struct msm_gpu_state_bo *bo)
+{
+   if ((!bo->data && !bo->encoded) || !bo->size)
return;

-   l = ascii85_encode_len(datalen);
+   if (!bo->encoded) {
+   long datalen, i;
+   u32 *buf = bo->data;
+
+   /*
+* Only dump the non-zero part of the buffer - rarely will
+* any data completely fill the entire allocated size of
+* the buffer.
+*/
+   for (datalen = 0, i = 0; i < (bo->size) >> 2; i++) {
+   if (buf[i])
+   datalen = ((i + 1) << 2);
+   }
+
+   bo->encoded = adreno_gpu_ascii85_encode(buf, datalen);
+
+   kvfree(bo->data);
+   bo->data = NULL;
+
+   if (!bo->encoded)
+   return;
+   }

drm_puts(p, "data: !!ascii85 |\n");
drm_puts(p, " ");

-   for (i = 0; i < l; i++)
-   drm_puts(p, ascii85_encode(ptr[i], out));
+   drm_puts(p, bo->encoded);

drm_puts(p, "\n");
 }
@@ -531,8 +571,7 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state 
*state,
drm_printf(p, "wptr: %d\n", state->ring[i].wptr);
drm_printf(p, "size: %d\n", MSM_GPU_RINGBUFFER_SZ);

-   adreno_show_object(p, state->ring[i].bo.data,
-   state->ring[i].bo.size);
+   adreno_show_object(p, &(state->ring[i].bo));
}

if (state->bos) {
@@ -543,8 +582,7 @@ void adreno_show(struct msm_gpu *gpu, struct msm_gpu_state 
*state,
state->bos[i].iova);
drm_printf(p, "size: %zd\n", state->bos[i].size);

-   adreno_show_object(p, state->bos[i].data,
-

[Freedreno] [PATCH 2/4] include/linux/ascii85: Update ascii85_encode()

2018-11-20 Thread Sharat Masetty
The current implementation of ascii85_encode() returns a static "z"
string instead of writing it to the output buffer when the input is
zero. This patch simply adds the missing copy, which makes it easier
to use this function to encode large buffers.

Signed-off-by: Sharat Masetty 
---
 include/linux/ascii85.h | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/linux/ascii85.h b/include/linux/ascii85.h
index 4cc4020..00646fc 100644
--- a/include/linux/ascii85.h
+++ b/include/linux/ascii85.h
@@ -23,8 +23,12 @@
 {
int i;
 
-   if (in == 0)
-   return "z";
+   if (in == 0) {
+   out[0] = 'z';
+   out[1] = '\0';
+
+   return out;
+   }
 
out[5] = '\0';
for (i = 5; i--; ) {
-- 
1.9.1



[Freedreno] [PATCH 3/4] drm/msm: Use msm_gpu_state_bo for ringbuffer data

2018-11-20 Thread Sharat Masetty
The ring substructure in msm_gpu_state is an extension of
msm_gpu_state_bo, so this patch changes the ring structure
to reuse msm_gpu_state_bo as a base class instead of
redefining the required fields.

Signed-off-by: Sharat Masetty 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 20 +++-
 drivers/gpu/drm/msm/msm_gpu.h   |  4 +---
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 6ebe842..bbf8d3e 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -383,7 +383,7 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct 
msm_gpu_state *state)
int size = 0, j;
 
state->ring[i].fence = gpu->rb[i]->memptrs->fence;
-   state->ring[i].iova = gpu->rb[i]->iova;
+   state->ring[i].bo.iova = gpu->rb[i]->iova;
state->ring[i].seqno = gpu->rb[i]->seqno;
state->ring[i].rptr = get_rptr(adreno_gpu, gpu->rb[i]);
state->ring[i].wptr = get_wptr(gpu->rb[i]);
@@ -397,10 +397,12 @@ int adreno_gpu_state_get(struct msm_gpu *gpu, struct 
msm_gpu_state *state)
size = j + 1;
 
if (size) {
-   state->ring[i].data = kvmalloc(size << 2, GFP_KERNEL);
-   if (state->ring[i].data) {
-   memcpy(state->ring[i].data, gpu->rb[i]->start, 
size << 2);
-   state->ring[i].data_size = size << 2;
+   state->ring[i].bo.data =
+   kvmalloc(size << 2, GFP_KERNEL);
+   if (state->ring[i].bo.data) {
+   memcpy(state->ring[i].bo.data,
+   gpu->rb[i]->start, size << 2);
+   state->ring[i].bo.size = size << 2;
}
}
}
@@ -440,7 +442,7 @@ void adreno_gpu_state_destroy(struct msm_gpu_state *state)
int i;
 
for (i = 0; i < ARRAY_SIZE(state->ring); i++)
-   kvfree(state->ring[i].data);
+   kvfree(state->ring[i].bo.data);
 
for (i = 0; state->bos && i < state->nr_bos; i++)
kvfree(state->bos[i].data);
@@ -522,15 +524,15 @@ void adreno_show(struct msm_gpu *gpu, struct 
msm_gpu_state *state,
 
for (i = 0; i < gpu->nr_rings; i++) {
drm_printf(p, "  - id: %d\n", i);
-   drm_printf(p, "iova: 0x%016llx\n", state->ring[i].iova);
+   drm_printf(p, "iova: 0x%016llx\n", state->ring[i].bo.iova);
drm_printf(p, "last-fence: %d\n", state->ring[i].seqno);
drm_printf(p, "retired-fence: %d\n", state->ring[i].fence);
drm_printf(p, "rptr: %d\n", state->ring[i].rptr);
drm_printf(p, "wptr: %d\n", state->ring[i].wptr);
drm_printf(p, "size: %d\n", MSM_GPU_RINGBUFFER_SZ);
 
-   adreno_show_object(p, state->ring[i].data,
-   state->ring[i].data_size);
+   adreno_show_object(p, state->ring[i].bo.data,
+   state->ring[i].bo.size);
}
 
if (state->bos) {
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 7dc775f..a3a6371 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -198,13 +198,11 @@ struct msm_gpu_state {
struct timeval time;
 
struct {
-   u64 iova;
u32 fence;
u32 seqno;
u32 rptr;
u32 wptr;
-   void *data;
-   int data_size;
+   struct msm_gpu_state_bo bo;
} ring[MSM_GPU_MAX_RINGS];
 
int nr_registers;
-- 
1.9.1



[Freedreno] [PATCH 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-20 Thread Vivek Gautam
dma_map_sg() expects a DMA domain. However, the drm devices have
traditionally been using an unmanaged iommu domain, which is not a
DMA type. Using the DMA mapping APIs with that domain is bad.

Replace dma_map_sg() calls with dma_sync_sg_for_device{|cpu}()
to do the cache maintenance.

Signed-off-by: Vivek Gautam 
Suggested-by: Tomasz Figa 
---

Tested on an MTP sdm845:
https://github.com/vivekgautam1/linux/tree/v4.19/sdm845-mtp-display-working

 drivers/gpu/drm/msm/msm_gem.c | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 00c795ced02c..d7a7af610803 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -81,6 +81,8 @@ static struct page **get_pages(struct drm_gem_object *obj)
struct drm_device *dev = obj->dev;
struct page **p;
int npages = obj->size >> PAGE_SHIFT;
+   struct scatterlist *s;
+   int i;
 
if (use_pages(obj))
p = drm_gem_get_pages(obj);
@@ -107,9 +109,19 @@ static struct page **get_pages(struct drm_gem_object *obj)
/* For non-cached buffers, ensure the new pages are clean
 * because display controller, GPU, etc. are not coherent:
 */
-   if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
-   dma_map_sg(dev->dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED)) {
+   /*
+* Fake up the SG table so that dma_sync_sg_*()
+* can be used to flush the pages associated with it.
+*/
+   for_each_sg(msm_obj->sgt->sgl, s,
+   msm_obj->sgt->nents, i)
+   sg_dma_address(s) = sg_phys(s);
+
+   dma_sync_sg_for_device(dev->dev, msm_obj->sgt->sgl,
+  msm_obj->sgt->nents,
+  DMA_TO_DEVICE);
+   }
}
 
return msm_obj->pages;
@@ -137,10 +149,11 @@ static void put_pages(struct drm_gem_object *obj)
 * pages are clean because display controller,
 * GPU, etc. are not coherent:
 */
-   if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
-   dma_unmap_sg(obj->dev->dev, msm_obj->sgt->sgl,
-msm_obj->sgt->nents,
-DMA_BIDIRECTIONAL);
+   if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED))
+   dma_sync_sg_for_cpu(obj->dev->dev,
+   msm_obj->sgt->sgl,
+   msm_obj->sgt->nents,
+   DMA_FROM_DEVICE);
 
sg_free_table(msm_obj->sgt);
kfree(msm_obj->sgt);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
