Re: [Mesa-dev] [Intel-gfx] [PATCH] drm/i915/gt: Add separate MOCS table for Gen12 devices other than TGL/RKL

2021-09-09 Thread Ville Syrjälä
On Thu, Sep 09, 2021 at 01:33:23PM -0700, Matt Roper wrote:
> On Thu, Sep 09, 2021 at 10:59:36PM +0300, Ville Syrjälä wrote:
> > On Thu, Sep 09, 2021 at 11:14:15AM -0700, Matt Roper wrote:
> > > On Thu, Sep 09, 2021 at 08:42:15PM +0300, Ville Syrjälä wrote:
> > > > On Thu, Sep 09, 2021 at 10:15:56AM -0700, Matt Roper wrote:
> > > > > On Thu, Sep 09, 2021 at 06:09:55PM +0300, Ville Syrjälä wrote:
> > > > > > On Thu, Sep 09, 2021 at 08:00:02AM -0700, Matt Roper wrote:
> > > > > > > On Thu, Sep 09, 2021 at 05:39:26PM +0300, Ville Syrjälä wrote:
> > > > > > > > On Thu, Sep 09, 2021 at 07:29:33AM -0700, Matt Roper wrote:
> > > > > > > > > On Thu, Sep 09, 2021 at 04:58:50PM +0300, Ville Syrjälä wrote:
> > > > > > > > > > On Tue, Sep 07, 2021 at 11:19:29AM -0700, Matt Roper wrote:
> > > > > > > > > > > On Tue, Sep 07, 2021 at 08:41:06PM +0300, Ville Syrjälä 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Tue, Sep 07, 2021 at 10:27:28AM -0700, Matt Roper 
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > On Tue, Sep 07, 2021 at 10:46:39PM +0530, Ayaz A 
> > > > > > > > > > > > > Siddiqui wrote:
> > > > > > > > > > > > > > MOCS table of TGL/RKL has MOCS[1] set to L3_UC.
> > > > > > > > > > > > > > While for other gen12 devices we need to set 
> > > > > > > > > > > > > > MOCS[1] as L3_WB,
> > > > > > > > > > > > > > So adding a new MOCS table for other gen 12 devices 
> > > > > > > > > > > > > > eg. ADL.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Fixes: cfbe5291a189 ("drm/i915/gt: Initialize 
> > > > > > > > > > > > > > unused MOCS entries with device specific values")
> > > > > > > > > > > > > > Cc: Matt Roper 
> > > > > > > > > > > > > > Signed-off-by: Ayaz A Siddiqui 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Yep, we overlooked that the TGL table still had an 
> > > > > > > > > > > > > explicit entry for
> > > > > > > > > > > > > I915_MOCS_PTE and wasn't just using an implicit 
> > > > > > > > > > > > > 'unused_entries' lookup
> > > > > > > > > > > > > for MOCS[1].  The new table is the same as the TGL 
> > > > > > > > > > > > > table, just with
> > > > > > > > > > > > > I915_MOCS_PTE (1) removed.
> > > > > > > > > > > > 
> > > > > > > > > > > > And just how are people planning on handling display 
> > > > > > > > > > > > cacheability
> > > > > > > > > > > > control without a PTE MOCS entry? Is Mesa/etc. already 
> > > > > > > > > > > > making all
> > > > > > > > > > > > external bos uncached on these platforms just in case 
> > > > > > > > > > > > we might
> > > > > > > > > > > > scan out said bo?
> > > > > > > > > > > 
> > > > > > > > > > > MOCS entry 1 has never been considered a valid MOCS table 
> > > > > > > > > > > entry on gen12
> > > > > > > > > > > platforms (despite the old #define, it's not actually 
> > > > > > > > > > > related to PTE,
> > > > > > > > > > > display, etc. anymore).
> > > > > > > > > > 
> > > > > > > > > > So can someone finally explain to me how we're supposed to 
> > > > > > > > > > cache
> > > > > > > > > > anything that might become a scanout buffer later (eg. 
> > > > > > > > > > window system
> > > > > > > > >

Re: [Mesa-dev] [Intel-gfx] [PATCH] drm/i915/gt: Add separate MOCS table for Gen12 devices other than TGL/RKL

2021-09-09 Thread Ville Syrjälä
On Thu, Sep 09, 2021 at 11:14:15AM -0700, Matt Roper wrote:
> On Thu, Sep 09, 2021 at 08:42:15PM +0300, Ville Syrjälä wrote:
> > On Thu, Sep 09, 2021 at 10:15:56AM -0700, Matt Roper wrote:
> > > On Thu, Sep 09, 2021 at 06:09:55PM +0300, Ville Syrjälä wrote:
> > > > On Thu, Sep 09, 2021 at 08:00:02AM -0700, Matt Roper wrote:
> > > > > On Thu, Sep 09, 2021 at 05:39:26PM +0300, Ville Syrjälä wrote:
> > > > > > On Thu, Sep 09, 2021 at 07:29:33AM -0700, Matt Roper wrote:
> > > > > > > On Thu, Sep 09, 2021 at 04:58:50PM +0300, Ville Syrjälä wrote:
> > > > > > > > On Tue, Sep 07, 2021 at 11:19:29AM -0700, Matt Roper wrote:
> > > > > > > > > On Tue, Sep 07, 2021 at 08:41:06PM +0300, Ville Syrjälä wrote:
> > > > > > > > > > On Tue, Sep 07, 2021 at 10:27:28AM -0700, Matt Roper wrote:
> > > > > > > > > > > On Tue, Sep 07, 2021 at 10:46:39PM +0530, Ayaz A Siddiqui 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > MOCS table of TGL/RKL has MOCS[1] set to L3_UC.
> > > > > > > > > > > > While for other gen12 devices we need to set MOCS[1] as 
> > > > > > > > > > > > L3_WB,
> > > > > > > > > > > > So adding a new MOCS table for other gen 12 devices eg. 
> > > > > > > > > > > > ADL.
> > > > > > > > > > > > 
> > > > > > > > > > > > Fixes: cfbe5291a189 ("drm/i915/gt: Initialize unused 
> > > > > > > > > > > > MOCS entries with device specific values")
> > > > > > > > > > > > Cc: Matt Roper 
> > > > > > > > > > > > Signed-off-by: Ayaz A Siddiqui 
> > > > > > > > > > > 
> > > > > > > > > > > Yep, we overlooked that the TGL table still had an 
> > > > > > > > > > > explicit entry for
> > > > > > > > > > > I915_MOCS_PTE and wasn't just using an implicit 
> > > > > > > > > > > 'unused_entries' lookup
> > > > > > > > > > > for MOCS[1].  The new table is the same as the TGL table, 
> > > > > > > > > > > just with
> > > > > > > > > > > I915_MOCS_PTE (1) removed.
> > > > > > > > > > 
> > > > > > > > > > And just how are people planning on handling display 
> > > > > > > > > > cacheability
> > > > > > > > > > control without a PTE MOCS entry? Is Mesa/etc. already 
> > > > > > > > > > making all
> > > > > > > > > > external bos uncached on these platforms just in case we 
> > > > > > > > > > might
> > > > > > > > > > scan out said bo?
> > > > > > > > > 
> > > > > > > > > MOCS entry 1 has never been considered a valid MOCS table 
> > > > > > > > > entry on gen12
> > > > > > > > > platforms (despite the old #define, it's not actually related 
> > > > > > > > > to PTE,
> > > > > > > > > display, etc. anymore).
> > > > > > > > 
> > > > > > > > So can someone finally explain to me how we're supposed to cache
> > > > > > > > anything that might become a scanout buffer later (eg. window 
> > > > > > > > system
> > > > > > > > buffers)? Or are we just making everything like that UC now, 
> > > > > > > > and is
> > > > > > > > everyone happy with that? Is userspace actually following that?
> > > > > > > 
> > > > > > > Table entry #1 has never had anything to do with scanout on 
> > > > > > > gen12+.  I
> > > > > > > would assume that UMDs are either using the display entry in the 
> > > > > > > MOCS
> > > > > > > table (which is 61 on gen12+) or some other UC entry.
> > > > > > 
> > > > > > If 61 is meant to to be the new PTE entry wy hasn't it been defines 
> > > > > > as
> > > > > > such in the code? And I know for a fact that userspace

Re: [Mesa-dev] [Intel-gfx] [PATCH] drm/i915/gt: Add separate MOCS table for Gen12 devices other than TGL/RKL

2021-09-09 Thread Ville Syrjälä
On Thu, Sep 09, 2021 at 10:15:56AM -0700, Matt Roper wrote:
> On Thu, Sep 09, 2021 at 06:09:55PM +0300, Ville Syrjälä wrote:
> > On Thu, Sep 09, 2021 at 08:00:02AM -0700, Matt Roper wrote:
> > > On Thu, Sep 09, 2021 at 05:39:26PM +0300, Ville Syrjälä wrote:
> > > > On Thu, Sep 09, 2021 at 07:29:33AM -0700, Matt Roper wrote:
> > > > > On Thu, Sep 09, 2021 at 04:58:50PM +0300, Ville Syrjälä wrote:
> > > > > > On Tue, Sep 07, 2021 at 11:19:29AM -0700, Matt Roper wrote:
> > > > > > > On Tue, Sep 07, 2021 at 08:41:06PM +0300, Ville Syrjälä wrote:
> > > > > > > > On Tue, Sep 07, 2021 at 10:27:28AM -0700, Matt Roper wrote:
> > > > > > > > > On Tue, Sep 07, 2021 at 10:46:39PM +0530, Ayaz A Siddiqui 
> > > > > > > > > wrote:
> > > > > > > > > > MOCS table of TGL/RKL has MOCS[1] set to L3_UC.
> > > > > > > > > > While for other gen12 devices we need to set MOCS[1] as 
> > > > > > > > > > L3_WB,
> > > > > > > > > > So adding a new MOCS table for other gen 12 devices eg. ADL.
> > > > > > > > > > 
> > > > > > > > > > Fixes: cfbe5291a189 ("drm/i915/gt: Initialize unused MOCS 
> > > > > > > > > > entries with device specific values")
> > > > > > > > > > Cc: Matt Roper 
> > > > > > > > > > Signed-off-by: Ayaz A Siddiqui 
> > > > > > > > > 
> > > > > > > > > Yep, we overlooked that the TGL table still had an explicit 
> > > > > > > > > entry for
> > > > > > > > > I915_MOCS_PTE and wasn't just using an implicit 
> > > > > > > > > 'unused_entries' lookup
> > > > > > > > > for MOCS[1].  The new table is the same as the TGL table, 
> > > > > > > > > just with
> > > > > > > > > I915_MOCS_PTE (1) removed.
> > > > > > > > 
> > > > > > > > And just how are people planning on handling display 
> > > > > > > > cacheability
> > > > > > > > control without a PTE MOCS entry? Is Mesa/etc. already making 
> > > > > > > > all
> > > > > > > > external bos uncached on these platforms just in case we might
> > > > > > > > scan out said bo?
> > > > > > > 
> > > > > > > MOCS entry 1 has never been considered a valid MOCS table entry 
> > > > > > > on gen12
> > > > > > > platforms (despite the old #define, it's not actually related to 
> > > > > > > PTE,
> > > > > > > display, etc. anymore).
> > > > > > 
> > > > > > So can someone finally explain to me how we're supposed to cache
> > > > > > anything that might become a scanout buffer later (eg. window system
> > > > > > buffers)? Or are we just making everything like that UC now, and is
> > > > > > everyone happy with that? Is userspace actually following that?
> > > > > 
> > > > > Table entry #1 has never had anything to do with scanout on gen12+.  I
> > > > > would assume that UMDs are either using the display entry in the MOCS
> > > > > table (which is 61 on gen12+) or some other UC entry.
> > > > 
> > > > If 61 is meant to to be the new PTE entry wy hasn't it been defines as
> > > > such in the code? And I know for a fact that userspace (Mesa) is not
> > > 
> > > There is no "PTE" entry anymore.  But 61 is already documented as
> > > "displayable" in both the spec and the code:
> > > 
> > > /* HW Special Case (Displayable) */   
> > >
> > > MOCS_ENTRY(61, 
> > 
> > Why is it called a "HW special case"? I don't think there's any hw
> > magic in there?
> > 
> > And why aren't we setting it to PTE to get some cacheability for
> > window back buffers and such?
> 
> Who is "we" here?

We who care about the performance of the system.

> The MOCS table is a pre-defined set of per-platform
> magic numbers.  The software teams don't get to decide what the values
> are, we just program the hardware with the per-platform numbers that
> have been agree

Re: [Mesa-dev] [PATCH] intel: Add more PCI Device IDs for Coffee Lake and Ice Lake.

2019-02-20 Thread Ville Syrjälä
On Tue, Feb 19, 2019 at 09:36:14AM -0800, Rodrigo Vivi wrote:
> On Mon, Feb 18, 2019 at 04:54:34PM +1100, Jonathan Gray wrote:
> > Compared to linux and libdrm Mesa is missing a VLV and ICL id.
> > 
> > 0x0f30
> > ff049b6ce21d2814451afd4a116d001712e0116b
> > drm/i915: bind driver to ValleyView chipsets
> > 
> > 0x8A70
> > d55cb4fa2cf0105bfb16b60a2846737b91fdc173
> > drm/i915/icl: Add the ICL PCI IDs
> > 
> > The Intel Media SDK describes these as
> > 
> > /* VLV */
> > { 0x0f30, MFX_HW_VLV, MFX_GT1 },   /* VLV mobile */
> 
> hmmm... no idea about this one...
> Ville?
> maybe we should just remove from kernel since it was never
> missed from Mesa?

Bspec says that is the infamous X0. Assuming it's not
lying to us it should be safe to remove.

> 
> > 
> > /* ICL LP */
> > { 0x8A70, MFX_HW_ICL_LP, MFX_GT1 }
> 
> This is pure display so no Mesa needed for this ID.
> 
> > 
> > and libdrm's intel_chipset.h describes the VLV id as
> > 
> > #define PCI_CHIP_VALLEYVIEW_PO  0x0f30 /* VLV PO board */
> > 
> > It isn't clear what the ICL configuration should be from public
> > information.
> > 
> > diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> > index b91abd7a3f9..3568007b1ef 100644
> > --- a/include/pci_ids/i965_pci_ids.h
> > +++ b/include/pci_ids/i965_pci_ids.h
> > @@ -86,6 +86,7 @@ CHIPSET(0x0D2B, hsw_gt3, "Intel(R) Haswell")
> >  CHIPSET(0x0D0E, hsw_gt1, "Intel(R) Haswell")
> >  CHIPSET(0x0D1E, hsw_gt2, "Intel(R) Haswell")
> >  CHIPSET(0x0D2E, hsw_gt3, "Intel(R) Haswell")
> > +CHIPSET(0x0F30, byt, "Intel(R) Bay Trail")
> >  CHIPSET(0x0F31, byt, "Intel(R) Bay Trail")
> >  CHIPSET(0x0F32, byt, "Intel(R) Bay Trail")
> >  CHIPSET(0x0F33, byt, "Intel(R) Bay Trail")
> > @@ -212,4 +213,5 @@ CHIPSET(0x8A5A, icl_6x8, "Intel(R) HD Graphics (Ice 
> > Lake 6x8 GT1.5)")
> >  CHIPSET(0x8A5B, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
> >  CHIPSET(0x8A5C, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
> >  CHIPSET(0x8A5D, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
> > +CHIPSET(0x8A70, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
> >  CHIPSET(0x8A71, icl_1x8, "Intel(R) HD Graphics (Ice Lake 1x8 GT0.5)")

-- 
Ville Syrjälä
Intel
-
Intel Finland Oy
Registered Address: PL 281, 00181 Helsinki 
Business Identity Code: 0357606 - 4 
Domiciled in Helsinki 

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] i915/swrast vertex array regression

2018-08-20 Thread Ville Syrjälä
On Fri, Aug 10, 2018 at 01:13:31PM +0200, Mathias Fröhlich wrote:
> Hi Ville,
> 
> > Looks like
> > "vid_gl20" "0"
> > is needed to reproduce the issue.
> 
> Thanks! Now I can observe the problem!
> And the attached patch seems to fix what I can observe here.
> Does the attached patch also fix the problem with the
> older intel chip you originally reported?

Seems to. Thanks for tracking it down.

Tested-by: Ville Syrjälä 

> 
> Thanks!
> 
> Mathias

> From 68e921478c8c38a13b7258e0d3f1f235709dcfe9 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Mathias=20Fr=C3=B6hlich?= 
> Date: Fri, 10 Aug 2018 11:37:43 +0200
> Subject: [PATCH] tnl: Fix green gun regression in xonotic.
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Signed-off-by: Mathias Fröhlich 
> ---
>  src/mesa/tnl/t_split_copy.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/tnl/t_split_copy.c b/src/mesa/tnl/t_split_copy.c
> index cbb7eb409f..085ae9a28c 100644
> --- a/src/mesa/tnl/t_split_copy.c
> +++ b/src/mesa/tnl/t_split_copy.c
> @@ -531,7 +531,7 @@ replay_init(struct copy_context *copy)
> for (offset = 0, i = 0; i < copy->nr_varying; i++) {
>const struct tnl_vertex_array *src = copy->varying[i].array;
>const struct gl_array_attributes *srcattr = src->VertexAttrib;
> -  struct tnl_vertex_array *dst = >dstarray[i];
> +  struct tnl_vertex_array *dst = >dstarray[copy->varying[i].attr];
>struct gl_vertex_buffer_binding *dstbind = 
> >varying[i].dstbinding;
>struct gl_array_attributes *dstattr = >varying[i].dstattribs;
>  
> -- 
> 2.17.1
> 


-- 
Ville Syrjälä
Intel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] i915/swrast vertex array regression

2018-07-31 Thread Ville Syrjälä
On Tue, Jul 31, 2018 at 08:29:18AM +0200, Mathias Fröhlich wrote:
> Hi Ville,
> 
> > I noticed a while back that xonotic had started to misrender the gun
> > models on i915. Yesterday I bisected it down to commit 64d2a2048054
> > ("mesa: Make gl_vertex_array contain pointers to first order VAO
> > members."). Actually that commit broke things even worse (and the
> > game would even crash after a while), but a later commit (presumably
> > 98f35ad63c23 ("vbo: Correctly handle source arrays in vbo_split_copy.")
> > fixed things a little bit. But even with current master the guns are
> > still being misrendered. I also verified that the same breakage was
> > visible with swrast, whereas llvmpipe and i965 seemed ok.
> > 
> > To reproduce you can run 'xonotic-glx -benchmark 
> > demos/the-big-keybench.dem' 
> > and wait until the guy picks up the mortar (I think that's what its
> > called) which is maybe the third gun he picks up in that demo. The gun 
> > is supposed to have some red color on it but now it has green.
> 
> I tried to reproduce that problem using classic swrast.
> Some variants of last weeks git master show the exact same
> pictures than  swrast using the last good commit from your bisect.
> So, basically I can't reproduce the problem here.

Looks like
"vid_gl20" "0"
is needed to reproduce the issue.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] i915/swrast vertex array regression

2018-07-23 Thread Ville Syrjälä
Hi Mathias,

I noticed a while back that xonotic had started to misrender the gun
models on i915. Yesterday I bisected it down to commit 64d2a2048054
("mesa: Make gl_vertex_array contain pointers to first order VAO
members."). Actually that commit broke things even worse (and the
game would even crash after a while), but a later commit (presumably
98f35ad63c23 ("vbo: Correctly handle source arrays in vbo_split_copy.")
fixed things a little bit. But even with current master the guns are
still being misrendered. I also verified that the same breakage was
visible with swrast, whereas llvmpipe and i965 seemed ok.

To reproduce you can run 'xonotic-glx -benchmark demos/the-big-keybench.dem' 
and wait until the guy picks up the mortar (I think that's what its
called) which is maybe the third gun he picks up in that demo. The gun 
is supposed to have some red color on it but now it has green.

It looked like there's already a bunch of stuff piled on top of the
regression so reverting didn't seem entirely trivial, and thus I didn't
bother trying.

-- 
Ville Syrjälä
Intel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] wsi_common_display: Deal with vscan values

2018-06-15 Thread Ville Syrjälä
On Thu, Jun 14, 2018 at 05:57:01PM -0700, Keith Packard wrote:
> We sorted out what 'vscan' means and are trying to use it correctly.
> 
> vscan = 0 is the same as vscan = 1, which is slightly annoying; we use
> MAX2(vscan, 1) everywhere.
> 
> randr doesn't pass vscan at all, so we set wsi mode vscan = 0.
> 
> The doublescan flag doubles the vscan value, so we don't need to deal
> with that separately, we can just compare flags normally.
> 
> Signed-off-by: Keith Packard 
> ---
>  src/vulkan/wsi/wsi_common_display.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/src/vulkan/wsi/wsi_common_display.c 
> b/src/vulkan/wsi/wsi_common_display.c
> index c7f794a0eff..de1c1826bd2 100644
> --- a/src/vulkan/wsi/wsi_common_display.c
> +++ b/src/vulkan/wsi/wsi_common_display.c
> @@ -149,7 +149,7 @@ wsi_display_mode_matches_drm(wsi_display_mode *wsi,
>wsi->vsync_start == drm->vsync_start &&
>wsi->vsync_end == drm->vsync_end &&
>wsi->vtotal == drm->vtotal &&
> -  wsi->vscan == drm->vscan &&
> +  MAX2(wsi->vscan, 1) == MAX2(drm->vscan, 1) &&
>wsi->flags == drm->flags;
>  }
>  
> @@ -158,7 +158,7 @@ wsi_display_mode_refresh(struct wsi_display_mode *wsi)
>  {
> return (double) wsi->clock * 1000.0 / ((double) wsi->htotal *
>(double) wsi->vtotal *
> -  (double) (wsi->vscan + 1));
> +  (double) MAX2(wsi->vscan, 1));

Are you dealing with INTERLACE anywhere? The kernel generally operates
under the assumption that vrefresh == field rate.

>  }
>  
>  static uint64_t wsi_get_current_monotonic(void)
> @@ -1657,6 +1657,7 @@ wsi_display_mode_matches_x(struct wsi_display_mode *wsi,
>wsi->vsync_start == xcb->vsync_start &&
>wsi->vsync_end == xcb->vsync_end &&
>wsi->vtotal == xcb->vtotal &&
> +  wsi->vscan <= 1 && 
>wsi->flags == xcb->mode_flags;
>  }
>  
> @@ -1707,8 +1708,6 @@ wsi_display_register_x_mode(struct wsi_device 
> *wsi_device,
> display_mode->vsync_end = x_mode->vsync_end;
> display_mode->vtotal = x_mode->vtotal;
> display_mode->vscan = 0;
> -   if (x_mode->mode_flags & XCB_RANDR_MODE_FLAG_DOUBLE_SCAN)
> -  display_mode->vscan = 1;
> display_mode->flags = x_mode->mode_flags;
>  
> list_addtail(_mode->list, >display_modes);
> -- 
> 2.17.1
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Ville Syrjälä
Intel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] meson: Fix build for egl platform_x11 with dri3

2018-05-11 Thread Ville Syrjälä
On Fri, May 11, 2018 at 03:26:57PM +0100, Eric Engestrom wrote:
> On Monday, 2018-05-07 20:05:40 +0300, Ville Syrjälä wrote:
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > 
> > platform_x11 with dri3 needs inc_loader.
> > 
> > In file included from ../src/egl/drivers/dri2/platform_x11_dri3.c:35:0:
> > ../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: 
> > loader_dri3_helper.h: No such file or directory
> > In file included from ../src/egl/drivers/dri2/platform_x11.c:46:0:
> > ../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: 
> > loader_dri3_helper.h: No such file or directory
> > In file included from ../src/egl/drivers/dri2/egl_dri2.c:61:0:
> > ../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: 
> > loader_dri3_helper.h: No such file or directory
> > 
> > Cc: Dylan Baker <dy...@pnwbakers.com>
> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> 
> Reviewed-by: Eric Engestrom <eric.engest...@intel.com>

Thanks. Pushed to master.

> 
> > ---
> >  src/egl/meson.build | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/src/egl/meson.build b/src/egl/meson.build
> > index 6537e4bdee61..9050d763a6cd 100644
> > --- a/src/egl/meson.build
> > +++ b/src/egl/meson.build
> > @@ -102,6 +102,7 @@ if with_platform_x11
> >if with_dri3
> >  files_egl += files('drivers/dri2/platform_x11_dri3.c')
> >  link_for_egl += libloader_dri3_helper
> > +incs_for_egl += inc_loader
> >endif
> >deps_for_egl += [dep_x11_xcb, dep_xcb_dri2, dep_xcb_xfixes]
> >  endif

-- 
Ville Syrjälä
Intel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/16] anv/image: Separate modifiers from legacy scanout

2018-02-14 Thread Ville Syrjälä
On Wed, Feb 14, 2018 at 04:43:15PM +, Daniel Stone wrote:
> On 14 February 2018 at 16:27, Jason Ekstrand <ja...@jlekstrand.net> wrote:
> > On Wed, Feb 14, 2018 at 3:39 AM, Daniel Stone <dan...@fooishbar.org> wrote:
> >> > On Tue, Feb 13, 2018 at 10:48 AM, Daniel Stone <dan...@fooishbar.org> 
> >> > wrote:
> >> >> For actual scanout, the kernel very specifically demands that if the
> >> >> BO is X-tiled, then set_tiling be called with TILING_X. This applies
> >> >> even if we explicitly allocate with MOD_X_TILED and pass that in via
> >> >> drmModeAddFB2WithModifiers: there is an explicit check for
> >> >> !!(bo_tiling == TILING_X) == !!(modifier == MOD_X_TILED).
> >>
> >> You missed this bit. Suggested fixup: https://hastebin.com/xikopobiza
> >
> > I was really hoping I'd read wrong.  I'm going with "that's a kernel bug".

Old kernels required TILING_X with MOD_X. I changed that at some
point to allow TILING_NONE with any modifier, but otherwise we
require the BO tiling to match the modifier. Ie. you still can't
mix TILING_Y + MOD_X for example.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [ANNOUNCE] Wayland/Weston/Mesa HDR support (proof of concept)

2017-12-21 Thread Ville Syrjälä
Here's a quick proof of concept implementation of HDR support
for Wayland/Weston/Mesa.

I'm not posting this as patches right now because I'm not sure
that would do much good given how rough this is. But here are
all the repos/branches:
git://github.com/vsyrjala/wayland.git hdr_poc
git://github.com/vsyrjala/wayland-protocols.git hdr_poc
git://github.com/vsyrjala/weston.git hdr_poc
git://github.com/vsyrjala/mesa.git hdr_poc
git://github.com/vsyrjala/linux.git hdr_poc

The kernel HDR bits were partially done by Uma Shankar, the rest
I hacked together myself.

As far as Wayland protocol goes I'm adding three new
extensions (should probably just have one with several requests?):
- zwp_colorspace_v1 - Specify the primaries/whitepoint chromacities 
  and transfer function for a surface
- zwp_ycbcr_encoding_v1 - Specify the encoding for YCbCr surfaces
- zwp_hdr_metadata_v1 - Allow the client to pass HDR metadata to 
  the compositor
Note that I've not given any thought to how the compositor might
advertize its capabilities.

I also hacked in a bunch of 10bpc+ YCbCr support to the protocol and
Weston so that I can actually get some HDR video onto the screen.

On the Mesa side I've crudely implementated the following egl/vk
extesions:
- EXT_gl_colorspace_*
- EXT_surface_SMPTE2086_metadata
- EXT_surface_CTA861_3_metadata
  (sidenote: these egl extension don't seem to match CTA-861.3 nicely
   when it comes to the min/max luminance stuff)
- VK_EXT_hdr_metadata

VK_EXT_hdr_metadata I plugged in for anv only, but the implementation
is in the common wayland wsi code. Note that I haven't actually tested
the vulkan stuff at all because I don't talk Vulkan (at least not yet).

Also note that I've not connected up the HDR metadata pipeline 
properly. The client can provide the metadata, but the compositor
doesn't actually pass it on to the display. For the time being the
HDR metadata that gets passed to the display is partially specified
in weston.ini and partially just hardocded (see
libweston/compositor-drm.c).

The Weston implementation involves a bunch of shaders and matrices to
do the ycbcr->rgb, "degamma", csc for each surface, blend it all as
linear RGB into an fp16 fbo, and finally blit that out to the final
framebuffer while applying the ST2084 PQ transfer function in the
process.

The reason for the fp16 fbo is that we store the full 1 nits of
linear RGB. That needs plenty of precisions in the low end so your
regular 10bpc fb doesn't seem to cut it. And also the display gamma LUT
doesn't have enough input precision for it either. Sadly there's no
fixed function hardware in the GPU to do the ST2084 PQ when blending.
When the output is not HDR I do skip the fp16 fbo step and use the
gamma LUT in the display engine instead. 

Another approach to the precisions problem might be to not store the
entire 1 nits of linear, and just cut off the super bright stuff
(your display can't show it anyway). But I've not really bothered to
figure out how low in nits we'd have to go here, probably too low.
Maybe blending as sRGB and the doing sRGB->PQ with the gamma LUT might
help a little bit?

Ideally we would bypass this all for a single fullscreen HDR surface
and just pass the PQ encoded data directly through. But I've not
implemented that. In fact I even disable the buffer_age damage stuff
when using the fp16 fbo, so we'll recompose the entire screen every
time. Yeah, I'm lazy.

Another thought that occurred to me was that it shouldn't be too hard
to make Weston do some tone mapping when there's a HDR client and no
HDR screen. To that end I included the ACES colorspaces in my
colorspace list, but I didn't actually look into plugging the ACES tone
mapping curve into the shaders. Might be a fun excercise, even though
the practical applications might be close to nil. Probably better to
not advertize HDR/wide gamuts when we can't actually output the stuff,
and instead let the client do its own tone mapping.

OK, so what can you do with this? I've included a few test clients:
- simple-colorspace
  Just a copy of simple-egl but it uses the egl extension to specify
  the colorspace, and produces ST2084 PQ encoded data when asked
- simple-hdr-video
  Uses ffmpeg to decode video into shm buffers, and sets the 
  colorspace/ycbcr encoding etc. appropriately. Ie. this one can
  actually output HDR video

Here's a weston.ini snippet that gets you outputting HDR:
[core]
gbm-format=xrgb2101010

[output]
name=HDMI-A-2
colorspace=BT.2020
gamma=ST2084
max_sdr_nits=100

Hardware wise you'll need a HDR capable display obviously, and 
you'll need an Intel Geminilake GPU. Older Intel platforms don't
support the HDR infoframe, so the display wouldn't know what to do
with the data you're feeding it.

As for the future, right now I don't really have any solid plans on
continuing to develop this. I might dabble with it a bit more out of
curiosity, but I'm more hoping we can find other people to move this
forward properly.

-

Re: [Mesa-dev] [PATCH] i965: Disable L3 cache allocation for external buffers

2017-11-01 Thread Ville Syrjälä
;.offset = mt->offset,
> >.reloc_flags = is_render_target ? EXEC_OBJECT_WRITE : 0,
> > };
> > +   surf->external = mt->bo->external;
> >  
> > surf->aux_usage = aux_usage;
> >  
> > diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
> > b/src/mesa/drivers/dri/i965/brw_state.h
> > index 8db354cf23..01c0cd12cb 100644
> > --- a/src/mesa/drivers/dri/i965/brw_state.h
> > +++ b/src/mesa/drivers/dri/i965/brw_state.h
> > @@ -342,6 +342,7 @@ void gen10_init_atoms(struct brw_context *brw);
> >   * may still respect that.
> >   */
> >  #define GEN7_MOCS_L31
> > +#define GEN7_MOCS_PTE   0
> 
> I think we want to keep the current bitfield (it obviously works with
> uncached scanout buffers), but with a big comment:
> 
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
> b/src/mesa/drivers/dri/i965/brw_state.h
> index 8db354cf232d..4e418f5e8ea5 100644
> --- a/src/mesa/drivers/dri/i965/brw_state.h
> +++ b/src/mesa/drivers/dri/i965/brw_state.h
> @@ -342,6 +342,12 @@ void gen10_init_atoms(struct brw_context *brw);
>   * may still respect that.
>   */
>  #define GEN7_MOCS_L31
> +/* Even when we obey the kernel's cacheability setting from the PTE we still
> + * want to explicitly enable L3 caching. L3$ seems to be automatically 
> disabled
> + * when the PTE request uncached mode - the kernel never explicitly flushed 
> L3
> + * on gen7.
> + */

I doubt the the LLC caheability control has any effect on L3 (except on
CHV but that one doesn't even have LLC, and I think they just messed up
the bits for fun). I can't remmeber if I ever properly tested the
L3+LLC=UC combination on other platforms though.

Also there's no way to flush L3. You can flush/invalidate caches that
may or may not be backed by L3. And we do seem to flush/invalidate all
of them. IIRC we were missing the data cache flush bit from the
PIPE_CONTROL at some point, but now it seems to be there. Also I
doubt the data cache would be involved in your typical render
target accesses.

The spec does say that if something gets evicted from L3 it may end up
in LLC even if LLC cacheability is UC. I assume that must be talking
about the data cache since IIRC that is the only RW cache backed by L3,
and I can't see why evicting something from a RO cache would end up
anywhere.

So if we don't have to worry about the eviction issue I think always
using L3 is the correct answer.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Ville Syrjälä
On Tue, Oct 17, 2017 at 09:00:56PM +0200, Nicolai Hähnle wrote:
> On 17.10.2017 16:09, Ville Syrjälä wrote:
> > On Tue, Oct 17, 2017 at 03:46:24PM +0200, Michel Dänzer wrote:
> >> On 17/10/17 02:22 PM, Daniel Vetter wrote:
> >>> On Tue, Oct 17, 2017 at 12:28:17PM +0200, Michel Dänzer wrote:
> >>>> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
> >>>
> >>>>> Common sense suggests that there need to be two side to FreeSync / VESA
> >>>>> Adaptive Sync support:
> >>>>>
> >>>>> 1. Query the display capabilities. This means querying minimum / maximum
> >>>>> refresh duration, plus possibly a query for when the earliest/latest
> >>>>> timing of the *next* refresh.
> >>>>>
> >>>>> 2. Signal desired present time. This means passing a target timer value
> >>>>> instead of a target vblank count, e.g. something like this for the KMS
> >>>>> interface:
> >>>>>
> >>>>>    int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
> >>>>>    uint32_t flags, void *user_data,
> >>>>>    uint64_t target);
> >>>>>
> >>>>>    + a flag to indicate whether target is the vblank count or the
> >>>>> CLOCK_MONOTONIC (?) time in ns.
> >>>>
> >>>> drmModePageFlip(Target) is part of the pre-atomic KMS API, but adapative
> >>>> sync should probably only be supported via the atomic API, presumably
> >>>> via output properties.
> >>>
> >>> +1
> >>>
> >>> At least now that DC is on track to land properly, and you want to do this
> >>> for DC-only anyway there's no reason to pimp the legacy interfaces
> >>> further. And atomic is soo much easier to extend.
> >>>
> >>> The big question imo is where we need to put the flag on the kms side,
> >>> since freesync is not just about presenting earlier, but also about
> >>> presenting later. But for backwards compat we can't stretch the refresh
> >>> rate by default for everyone, or clients that rely on high precision
> >>> timestamps and regular refresh will get a bad surprise.
> >>
> >> The idea described above is that adaptive sync would be used for flips
> >> with a target timestamp. Apps which don't want to use adaptive sync
> >> wouldn't set a target timestamp.
> >>
> >>
> >>> I think a boolean enable_freesync property is probably what we want, which
> >>> enables freesync for as long as it's set.
> >>
> >> The question then becomes under what circumstances the property is (not)
> >> set. Not sure offhand this will actually solve any problem, or just push
> >> it somewhere else.
> >>
> >>
> >>> Finally I'm not sure we want to insist on a target time for freesync. At
> >>> least as far as I understand things you just want "as soon as possible".
> >>> This might change with some of the VK/EGL/GLX extensions where you
> >>> specify a precise timing (media playback). But that needs a bit more work
> >>> to make it happen I think, so perhaps better to postpone.
> >>
> >> I don't see why. There's an obvious use case for this now, for video
> >> playback. At least VDPAU already has target timestamps for this.
> >>
> >>
> >>> Also note that right now no driver expect amdgpu has support for a target
> >>> vblank on a flip. That's imo another reason for not requiring target
> >>> support for at least basic freesync support.
> >>
> >> I think that's a bad reason. :) Adding it for atomic drivers shouldn't
> >> be that hard.
> > 
> > Apart from the actual implementation hurdles it does open up some new 
> > questions:
> 
> All good questions, thanks! Let me try to take a crack at them:
> 
> 
> > - Is it going to be per-plane or per-crtc?
> 
> My understanding is that planes are combined to form a single signal 
> that goes out to the monitor(s). The planes are scanned out together by 
> a crtc, so it should be per-crtc.

I guess one might imagine a compositor with one video player type of
client, and another game/benchmark type of client. If both clients queue
their next frames around the same time, the compositor might think to
combine them to a single atomic ioctl call. But it's possible the
video player client would want i

Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync

2017-10-17 Thread Ville Syrjälä
On Tue, Oct 17, 2017 at 03:46:24PM +0200, Michel Dänzer wrote:
> On 17/10/17 02:22 PM, Daniel Vetter wrote:
> > On Tue, Oct 17, 2017 at 12:28:17PM +0200, Michel Dänzer wrote:
> >> On 17/10/17 11:34 AM, Nicolai Hähnle wrote:
> > 
> >>> Common sense suggests that there need to be two side to FreeSync / VESA
> >>> Adaptive Sync support:
> >>>
> >>> 1. Query the display capabilities. This means querying minimum / maximum
> >>> refresh duration, plus possibly a query for when the earliest/latest
> >>> timing of the *next* refresh.
> >>>
> >>> 2. Signal desired present time. This means passing a target timer value
> >>> instead of a target vblank count, e.g. something like this for the KMS
> >>> interface:
> >>>
> >>>   int drmModePageFlipTarget64(int fd, uint32_t crtc_id, uint32_t fb_id,
> >>>   uint32_t flags, void *user_data,
> >>>   uint64_t target);
> >>>
> >>>   + a flag to indicate whether target is the vblank count or the
> >>> CLOCK_MONOTONIC (?) time in ns.
> >>
> >> drmModePageFlip(Target) is part of the pre-atomic KMS API, but adapative
> >> sync should probably only be supported via the atomic API, presumably
> >> via output properties.
> > 
> > +1
> > 
> > At least now that DC is on track to land properly, and you want to do this
> > for DC-only anyway there's no reason to pimp the legacy interfaces
> > further. And atomic is soo much easier to extend.
> > 
> > The big question imo is where we need to put the flag on the kms side,
> > since freesync is not just about presenting earlier, but also about
> > presenting later. But for backwards compat we can't stretch the refresh
> > rate by default for everyone, or clients that rely on high precision
> > timestamps and regular refresh will get a bad surprise.
> 
> The idea described above is that adaptive sync would be used for flips
> with a target timestamp. Apps which don't want to use adaptive sync
> wouldn't set a target timestamp.
> 
> 
> > I think a boolean enable_freesync property is probably what we want, which
> > enables freesync for as long as it's set.
> 
> The question then becomes under what circumstances the property is (not)
> set. Not sure offhand this will actually solve any problem, or just push
> it somewhere else.
> 
> 
> > Finally I'm not sure we want to insist on a target time for freesync. At
> > least as far as I understand things you just want "as soon as possible".
> > This might change with some of the VK/EGL/GLX extensions where you
> > specify a precise timing (media playback). But that needs a bit more work
> > to make it happen I think, so perhaps better to postpone.
> 
> I don't see why. There's an obvious use case for this now, for video
> playback. At least VDPAU already has target timestamps for this.
> 
> 
> > Also note that right now no driver expect amdgpu has support for a target
> > vblank on a flip. That's imo another reason for not requiring target
> > support for at least basic freesync support.
> 
> I think that's a bad reason. :) Adding it for atomic drivers shouldn't
> be that hard.

Apart from the actual implementation hurdles it does open up some new questions:
- Is it going to be per-plane or per-crtc?
- What happens if the target timestamp is already stale?
- What happens if the target timestamp is good when it gets scheduled,
  but can't be met once the fences and whatnot have signalled?
- What happens if another operation is already queued with a more
  recent timestamp?
- Apart from a pure timestamp do we want to move the OML_sync/swap_whatever
  msc remainder etc. semantics into the kernel as well? It's just
  another way to specify the target flip time after all.

I do like the idea, but clearly there's a bit of thought require to
make sure the semantics are good.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] meson: Build i915

2017-10-13 Thread Ville Syrjälä
On Thu, Oct 12, 2017 at 01:51:24PM -0700, Dylan Baker wrote:
> Both patches are:
> Reviewed-by: Dylan Baker <dy...@pnwbakers.com>

Thanks guys. Pushed.

> 
> Quoting Ville Syrjala (2017-10-12 09:34:55)
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > 
> > Build i915 with meson. More or less copied from i965, with all
> > the unneeded cruft removed, and the libdrm_intel dependency added.
> > 
> > Cc: Dylan Baker <dy...@pnwbakers.com>
> > Cc: Eric Anholt <e...@anholt.net>
> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > ---
> >  meson.build   |  7 +++
> >  meson_options.txt |  2 +-
> >  src/mesa/drivers/dri/i915/meson.build | 97 
> > +++
> >  src/mesa/drivers/dri/meson.build  |  3 ++
> >  4 files changed, 108 insertions(+), 1 deletion(-)
> >  create mode 100644 src/mesa/drivers/dri/i915/meson.build
> > 
> > diff --git a/meson.build b/meson.build
> > index 4ba00283cec7..02264aeed4ef 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -72,16 +72,23 @@ if (with_gles1 or with_gles2) and not with_opengl
> >  endif
> >  
> >  with_dri = false
> > +with_dri_i915 = false
> >  with_dri_i965 = false
> >  with_dri_swrast = false
> >  _drivers = get_option('dri-drivers')
> >  if _drivers != ''
> >_split = _drivers.split(',')
> > +  with_dri_i915 = _split.contains('i915')
> >with_dri_i965 = _split.contains('i965')
> >with_dri_swrast = _split.contains('swrast')
> >with_dri = true
> >  endif
> >  
> > +dep_libdrm_intel = []
> > +if with_dri_i915
> > +  dep_libdrm_intel = dependency('libdrm_intel', version : '>= 2.4.75')
> > +endif
> > +
> >  if not with_dri
> >with_gles1 = false
> >with_gles2 = false
> > diff --git a/meson_options.txt b/meson_options.txt
> > index 029626d69a47..abd5135742ac 100644
> > --- a/meson_options.txt
> > +++ b/meson_options.txt
> > @@ -22,7 +22,7 @@ option('platforms', type : 'string', value : 
> > 'x11,wayland',
> > description : 'comma separated list of window systems to support. 
> > wayland, x11, surfaceless, drm, etc.')
> >  option('dri3', type : 'combo', value : 'auto', choices : ['auto', 'yes', 
> > 'no'],
> > description : 'enable support for dri3')
> > -option('dri-drivers', type : 'string', value : 'swrast,i965',
> > +option('dri-drivers', type : 'string', value : 'swrast,i915,i965',
> > description : 'comma separated list of dri drivers to build.')
> >  option('dri-drivers-path', type : 'string', value : '',
> > description : 'Location of dri drivers. Default: $libdir/dri.')
> > diff --git a/src/mesa/drivers/dri/i915/meson.build 
> > b/src/mesa/drivers/dri/i915/meson.build
> > new file mode 100644
> > index ..1971419a6b71
> > --- /dev/null
> > +++ b/src/mesa/drivers/dri/i915/meson.build
> > @@ -0,0 +1,97 @@
> > +# Copyright © 2017 Intel Corporation
> > +
> > +# Permission is hereby granted, free of charge, to any person obtaining a 
> > copy
> > +# of this software and associated documentation files (the "Software"), to 
> > deal
> > +# in the Software without restriction, including without limitation the 
> > rights
> > +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> > +# copies of the Software, and to permit persons to whom the Software is
> > +# furnished to do so, subject to the following conditions:
> > +
> > +# The above copyright notice and this permission notice shall be included 
> > in
> > +# all copies or substantial portions of the Software.
> > +
> > +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
> > OR
> > +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL 
> > THE
> > +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> > FROM,
> > +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
> > IN THE
> > +# SOFTWARE.
> > +
> > +files_i915 = files(
> > +  'i830_context.c',
> > +  'i830_context.h',
> > +  'i830_reg.h',
> > +  'i830_state.c',
> > +  'i830_texblend.c',
> > +  'i830_texstate.c',
> > +  'i830_vtbl.c',
> > +

Re: [Mesa-dev] [PATCH] meta: Fix BlitFramebuffer temp texture setup

2017-07-11 Thread Ville Syrjälä
On Mon, Jul 10, 2017 at 11:42:18PM +0300, Andres Gomez wrote:
> Ville, has this patch fallen through the cracks ?

Nope. I've still been looking into the issue whenever I've had a
few minutes to spare. I think I have it more or les figured out
at this stage, but I'll need to respin this patch, and clean up
some further patches to fix this stuff properly for i915.

> 
> On Fri, 2017-06-23 at 14:58 +0300, ville.syrj...@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > 
> > Pass the correct src coordinates to CopyTexSubImage()
> > when creating the temporary texture, and also take care to adjust
> > flipX/Y if the original src coordinates were flipped compared to
> > the new temporary texture src coordinates.
> > 
> > This fixes all the flip_src_x/y tests in
> > piglit.spec.arb_framebuffer_object.fbo-blit-stretch on i915, but
> > we're still left with the some failures in the stretch tests.
> > 
> > It looks to me like commit b702233f53d6 ("meta: Refactor the
> > BlitFramebuffer color CopyTexImage fallback.") most likely
> > broke this codepath.
> > 
> > Cc: mesa-sta...@lists.freedesktop.org
> > Cc: Eric Anholt <e...@anholt.net>
> > Cc: Kenneth Graunke <kenn...@whitecape.org>
> > Cc: Ian Romanick <ian.d.roman...@intel.com>
> > Cc: Anuj Phogat <anuj.pho...@gmail.com>
> > Fixes: b702233f53d6 ("meta: Refactor the BlitFramebuffer color CopyTexImage 
> > fallback.")
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=101414
> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > ---
> >  src/mesa/drivers/common/meta_blit.c | 8 ++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/common/meta_blit.c 
> > b/src/mesa/drivers/common/meta_blit.c
> > index 7adad469aceb..7262ecdfaf13 100644
> > --- a/src/mesa/drivers/common/meta_blit.c
> > +++ b/src/mesa/drivers/common/meta_blit.c
> > @@ -680,12 +680,16 @@ blitframebuffer_texture(struct gl_context *ctx,
> >}
> >  
> >_mesa_meta_setup_copypix_texture(ctx, meta_temp_texture,
> > -   srcX0, srcY0,
> > +   MIN2(srcX0, srcX1),
> > +   MIN2(srcY0, srcY1),
> > srcW, srcH,
> >     tex_base_format,
> > filter);
> >  
> > -
> > +  if (srcX0 > srcX1)
> > + flipX = -flipX;
> > +  if (srcY0 > srcY1)
> > + flipY = -flipY;
> >srcX0 = 0;
> >srcY0 = 0;
> >srcX1 = srcW;
> -- 
> Br,
> 
> Andres

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] meta: Fix BlitFramebuffer temp texture setup

2017-06-26 Thread Ville Syrjälä
On Fri, Jun 23, 2017 at 06:43:37PM -0700, Ian Romanick wrote:
> On 06/23/2017 04:58 AM, ville.syrj...@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > 
> > Pass the correct src coordinates to CopyTexSubImage()
> > when creating the temporary texture, and also take care to adjust
> > flipX/Y if the original src coordinates were flipped compared to
> > the new temporary texture src coordinates.
> 
> Yes... this seems right.  The calculation of srcW and srcH already take
> care to handle src[XY]0 > src[XY]1, so this make sense.
> 
> Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>
> 
> > This fixes all the flip_src_x/y tests in
> > piglit.spec.arb_framebuffer_object.fbo-blit-stretch on i915, but
> > we're still left with the some failures in the stretch tests.
> 
> Any idea what might be wrong with those?

Not really. Some of the reported values were off by just a little bit,
so migth be some precision issue, or maybe the coordinates are off
by a smidge, somehow. But IIRC there was also some bigger errors
reported so not sure. I'll try to take a better look at it at some
point.

> 
> > It looks to me like commit b702233f53d6 ("meta: Refactor the
> > BlitFramebuffer color CopyTexImage fallback.") most likely
> > broke this codepath.
> > 
> > Cc: mesa-sta...@lists.freedesktop.org
> > Cc: Eric Anholt <e...@anholt.net>
> > Cc: Kenneth Graunke <kenn...@whitecape.org>
> > Cc: Ian Romanick <ian.d.roman...@intel.com>
> > Cc: Anuj Phogat <anuj.pho...@gmail.com>
> > Fixes: b702233f53d6 ("meta: Refactor the BlitFramebuffer color CopyTexImage 
> > fallback.")
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=101414
> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > ---
> >  src/mesa/drivers/common/meta_blit.c | 8 ++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/common/meta_blit.c 
> > b/src/mesa/drivers/common/meta_blit.c
> > index 7adad469aceb..7262ecdfaf13 100644
> > --- a/src/mesa/drivers/common/meta_blit.c
> > +++ b/src/mesa/drivers/common/meta_blit.c
> > @@ -680,12 +680,16 @@ blitframebuffer_texture(struct gl_context *ctx,
> >}
> >  
> >_mesa_meta_setup_copypix_texture(ctx, meta_temp_texture,
> > -   srcX0, srcY0,
> > +   MIN2(srcX0, srcX1),
> > +   MIN2(srcY0, srcY1),
> > srcW, srcH,
> > tex_base_format,
> > filter);
> >  
> > -
> > +  if (srcX0 > srcX1)
> > + flipX = -flipX;
> > +  if (srcY0 > srcY1)
> > + flipY = -flipY;
> >srcX0 = 0;
> >srcY0 = 0;
> >srcX1 = srcW;

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i915: Fix gl_Fragcoord interpolation

2017-06-22 Thread Ville Syrjälä
On Thu, Jun 22, 2017 at 03:43:36PM +0300, Ville Syrjälä wrote:
> On Wed, Jun 21, 2017 at 03:13:22PM -0700, Ian Romanick wrote:
> > On 06/21/2017 10:38 AM, ville.syrj...@linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > 
> > > gl_FragCoord contains the window coordinates so it seems to me that
> > > we should not use perspective correct interpolation for it. At least
> > > now I get similar output as i965/swrast/llvmpipe produce.
> > > 
> > > This fixes dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_w.
> > > dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz was already
> > > passing, though I'm not quite sure how it managed to do that.
> > 
> > I suspect all the vertices had the same wrong w value, so the
> > interpolation just worked out.
> 
> IIRC the code looked like it was trying to at least have differing Z
> values on the vertices, but I must admit I didn't check what kind
> of projection it was trying to do. Or maybe I wasn't even reading the
> right piece of code. But anyways, the test still passes so I don't
> think I'll bother worrying about it.
> 
> > 
> > > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > 
> > One tiny comment below, but also
> > 
> > Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>
> > Cc: mesa-sta...@lists.freedesktop.org
> 
> Ta.
> 
> 
> > > diff --git a/src/mesa/drivers/dri/i915/intel_reg.h 
> > > b/src/mesa/drivers/dri/i915/intel_reg.h
> > > index dabbd612ee04..0e482de84281 100644
> > > --- a/src/mesa/drivers/dri/i915/intel_reg.h
> > > +++ b/src/mesa/drivers/dri/i915/intel_reg.h
> > > @@ -93,7 +93,8 @@
> > >  #define S2_TEX_COUNT_SHIFT_830   12
> > >  #define S2_VERTEX_1_WIDTH_SHIFT_830  0
> > >  #define S2_VERTEX_0_WIDTH_SHIFT_830  6
> > > -/* S3 not interesting */
> > > +
> > > +#define S3_TEXCOORD_PERSPECTIVE_DISABLE(unit)(1<<((unit)*4))
> > 
> > Might be worth adding the "wrap shortest" bits too in case someone ever
> > cares.  I know what they do, but I'm not sure how it's useful.  /shrug/
> 
> Sure, I can toss them in there for completeness.

Added, and pushed to master.

> 
> > 
> > >  
> > >  #define S4_POINT_WIDTH_SHIFT   23
> > >  #define S4_POINT_WIDTH_MASK(0x1ff<<23)
> > > 
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i915: Fix gl_Fragcoord interpolation

2017-06-22 Thread Ville Syrjälä
On Wed, Jun 21, 2017 at 03:13:22PM -0700, Ian Romanick wrote:
> On 06/21/2017 10:38 AM, ville.syrj...@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > 
> > gl_FragCoord contains the window coordinates so it seems to me that
> > we should not use perspective correct interpolation for it. At least
> > now I get similar output as i965/swrast/llvmpipe produce.
> > 
> > This fixes dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_w.
> > dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz was already
> > passing, though I'm not quite sure how it managed to do that.
> 
> I suspect all the vertices had the same wrong w value, so the
> interpolation just worked out.

IIRC the code looked like it was trying to at least have differing Z
values on the vertices, but I must admit I didn't check what kind
of projection it was trying to do. Or maybe I wasn't even reading the
right piece of code. But anyways, the test still passes so I don't
think I'll bother worrying about it.

> 
> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> 
> One tiny comment below, but also
> 
> Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>
> Cc: mesa-sta...@lists.freedesktop.org

Ta.


> > diff --git a/src/mesa/drivers/dri/i915/intel_reg.h 
> > b/src/mesa/drivers/dri/i915/intel_reg.h
> > index dabbd612ee04..0e482de84281 100644
> > --- a/src/mesa/drivers/dri/i915/intel_reg.h
> > +++ b/src/mesa/drivers/dri/i915/intel_reg.h
> > @@ -93,7 +93,8 @@
> >  #define S2_TEX_COUNT_SHIFT_830 12
> >  #define S2_VERTEX_1_WIDTH_SHIFT_8300
> >  #define S2_VERTEX_0_WIDTH_SHIFT_8306
> > -/* S3 not interesting */
> > +
> > +#define S3_TEXCOORD_PERSPECTIVE_DISABLE(unit)  (1<<((unit)*4))
> 
> Might be worth adding the "wrap shortest" bits too in case someone ever
> cares.  I know what they do, but I'm not sure how it's useful.  /shrug/

Sure, I can toss them in there for completeness.

> 
> >  
> >  #define S4_POINT_WIDTH_SHIFT   23
> >  #define S4_POINT_WIDTH_MASK(0x1ff<<23)
> > 

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i915: Always emit W on gen3

2017-06-21 Thread Ville Syrjälä
On Tue, Jun 20, 2017 at 11:00:52AM -0700, Ian Romanick wrote:
> Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>

Thanks. Pushed to master.

> 
> On 06/20/2017 10:22 AM, ville.syrj...@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > 
> > Unlike the older gen2 hardware, gen3 performs perspective
> > correct interpolation even for the primary/secondary colors.
> > To do that it naturally needs us to emit W for the vertices.
> > 
> > Currently we emit W only when at least one texture coordinate
> > set gets emitted. This means the interpolation of color will
> > change depending on whether texcoords/varyings are used or not.
> > That's probably not what anyone would expect, so let's just
> > always emit W to get consistent behaviour. Trying to avoid
> > emitting W seems like more hassle than it's worth, especially
> > as bspec seems to suggest that the hardware will perform the
> > perspective division anyway.
> > 
> > This used to be broken until it was accidentally fixed it in
> > commit c349031c27b7 ("i915: Fix texcoord vs. varying collision
> > in fragment programs") by introducing a bug that made the driver
> > always emit W. After fixing that bug in commit c1eedb43f32f
> > ("i915: Fix wpos_tex vs. -1 comparison") we went back to the
> > old behaviour and caused an apparent regression.
> > 
> > Fixes: c1eedb43f32f ("i915: Fix wpos_tex vs. -1 comparison")
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101451
> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > ---
> >  src/mesa/drivers/dri/i915/i915_fragprog.c | 10 --
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i915/i915_fragprog.c 
> > b/src/mesa/drivers/dri/i915/i915_fragprog.c
> > index e19ca60bd334..3657b2d82565 100644
> > --- a/src/mesa/drivers/dri/i915/i915_fragprog.c
> > +++ b/src/mesa/drivers/dri/i915/i915_fragprog.c
> > @@ -1252,12 +1252,10 @@ i915ValidateFragmentProgram(struct i915_context 
> > *i915)
> > intel->coloroffset = 0;
> > intel->specoffset = 0;
> >  
> > -   if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != 
> > I915_WPOS_TEX_INVALID) {
> > -  EMIT_ATTR(_TNL_ATTRIB_POS, EMIT_4F_VIEWPORT, S4_VFMT_XYZW, 16);
> > -   }
> > -   else {
> > -  EMIT_ATTR(_TNL_ATTRIB_POS, EMIT_3F_VIEWPORT, S4_VFMT_XYZ, 12);
> > -   }
> > +   /* Always emit W to get consistent perspective
> > +* correct interpolation of primary/secondary colors.
> > +*/
> > +   EMIT_ATTR(_TNL_ATTRIB_POS, EMIT_4F_VIEWPORT, S4_VFMT_XYZW, 16);
> >  
> > /* Handle gl_PointSize builtin var here */
> > if (ctx->Point._Attenuated || ctx->VertexProgram.PointSizeEnabled)
> > 

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/13] i915: On Gen <= 3 there is no W-tiling

2017-06-20 Thread Ville Syrjälä
On Sun, Jun 18, 2017 at 07:07:51PM -0700, Ian Romanick wrote:
> From: Ian Romanick <ian.d.roman...@intel.com>

I have the same patch in my tree :)

For the i915 parts of the series:
Reviewed-by: Ville Syrjälä <ville.syrj...@linux.intel.com>

> 
> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
> ---
>  src/mesa/drivers/dri/i915/intel_mipmap_tree.c |  5 ++---
>  src/mesa/drivers/dri/i915/intel_regions.c | 21 ++---
>  src/mesa/drivers/dri/i915/intel_regions.h |  5 ++---
>  src/mesa/drivers/dri/i915/intel_screen.c  |  7 +++
>  4 files changed, 9 insertions(+), 29 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i915/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i915/intel_mipmap_tree.c
> index 6c0f55b..bb6166e 100644
> --- a/src/mesa/drivers/dri/i915/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i915/intel_mipmap_tree.c
> @@ -525,14 +525,13 @@ intel_miptree_get_tile_offsets(struct intel_mipmap_tree 
> *mt,
> uint32_t x, y;
> uint32_t mask_x, mask_y;
>  
> -   intel_region_get_tile_masks(region, _x, _y, false);
> +   intel_region_get_tile_masks(region, _x, _y);
> intel_miptree_get_image_offset(mt, level, slice, , );
>  
> *tile_x = x & mask_x;
> *tile_y = y & mask_y;
>  
> -   return intel_region_get_aligned_offset(region, x & ~mask_x, y & ~mask_y,
> -  false);
> +   return intel_region_get_aligned_offset(region, x & ~mask_x, y & ~mask_y);
>  }
>  
>  static void
> diff --git a/src/mesa/drivers/dri/i915/intel_regions.c 
> b/src/mesa/drivers/dri/i915/intel_regions.c
> index c9b776d..be0dca4 100644
> --- a/src/mesa/drivers/dri/i915/intel_regions.c
> +++ b/src/mesa/drivers/dri/i915/intel_regions.c
> @@ -284,15 +284,11 @@ intel_region_release(struct intel_region 
> **region_handle)
>   */
>  void
>  intel_region_get_tile_masks(struct intel_region *region,
> -uint32_t *mask_x, uint32_t *mask_y,
> -bool map_stencil_as_y_tiled)
> +uint32_t *mask_x, uint32_t *mask_y)
>  {
> int cpp = region->cpp;
> uint32_t tiling = region->tiling;
>  
> -   if (map_stencil_as_y_tiled)
> -  tiling = I915_TILING_Y;
> -
> switch (tiling) {
> default:
>assert(false);
> @@ -317,25 +313,12 @@ intel_region_get_tile_masks(struct intel_region *region,
>   */
>  uint32_t
>  intel_region_get_aligned_offset(struct intel_region *region, uint32_t x,
> -uint32_t y, bool map_stencil_as_y_tiled)
> +uint32_t y)
>  {
> int cpp = region->cpp;
> uint32_t pitch = region->pitch;
> uint32_t tiling = region->tiling;
>  
> -   if (map_stencil_as_y_tiled) {
> -  tiling = I915_TILING_Y;
> -
> -  /* When mapping a W-tiled stencil buffer as Y-tiled, each 64-high 
> W-tile
> -   * gets transformed into a 32-high Y-tile.  Accordingly, the pitch of
> -   * the resulting region is twice the pitch of the original region, 
> since
> -   * each row in the Y-tiled view corresponds to two rows in the actual
> -   * W-tiled surface.  So we need to correct the pitch before computing
> -   * the offsets.
> -   */
> -  pitch *= 2;
> -   }
> -
> switch (tiling) {
> default:
>assert(false);
> diff --git a/src/mesa/drivers/dri/i915/intel_regions.h 
> b/src/mesa/drivers/dri/i915/intel_regions.h
> index 562f7cd..05375f1 100644
> --- a/src/mesa/drivers/dri/i915/intel_regions.h
> +++ b/src/mesa/drivers/dri/i915/intel_regions.h
> @@ -101,12 +101,11 @@ void intel_recreate_static_regions(struct intel_context 
> *intel);
>  
>  void
>  intel_region_get_tile_masks(struct intel_region *region,
> -uint32_t *mask_x, uint32_t *mask_y,
> -bool map_stencil_as_y_tiled);
> +uint32_t *mask_x, uint32_t *mask_y);
>  
>  uint32_t
>  intel_region_get_aligned_offset(struct intel_region *region, uint32_t x,
> -uint32_t y, bool map_stencil_as_y_tiled);
> +uint32_t y);
>  
>  /**
>   * Used with images created with image_from_names
> diff --git a/src/mesa/drivers/dri/i915/intel_screen.c 
> b/src/mesa/drivers/dri/i915/intel_screen.c
> index cba5434..fe8ece78f 100644
> --- a/src/mesa/drivers/dri/i915/intel_screen.c
> +++ b/src/mesa/drivers/dri/i915/intel_screen.c
> @@ -278,7 +278,7 @@ intel_setup_image_from_mipmap_tree(struct intel_context 
> *intel, __DRIimage *imag
>  
> intel_miptree_check_level_layer(mt, level, zoffset

Re: [Mesa-dev] [Mesa-stable] [PATCH] i915: Fix wpos_tex vs. -1 comparison

2017-06-14 Thread Ville Syrjälä
On Wed, Jun 14, 2017 at 11:45:17AM +0100, Emil Velikov wrote:
> Hi Ville
> 
> On 5 June 2017 at 15:52,  <ville.syrj...@linux.intel.com> wrote:
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> >
> > wpos_tex used to be a GLuint so assinging -1 to it and
> > later comparing with -1 worked correctly, but commit
> > c349031c27b7 ("i915: Fix texcoord vs. varying collision in
> > fragment programs") changed wpos_tex to uint8_t and hence
> > broke the comparison. To fix this define a more explicit
> > invalid value for wpos_tex.
> >
> s/assinging/assigning/

Fixed.

> 
> > gcc warns us:
> > i915_fragprog.c:1255:57: warning: comparison is always true due to limited 
> > range of data type [-Wtype-limits]
> > if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
> >  ^
> >
> > And clang says:
> > i915_fragprog.c:1255:57: warning: comparison of constant -1 with expression 
> > of type 'uint8_t' (aka 'unsigned char') is always true 
> > [-Wtautological-constant-out-of-range-compare]
> >if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
> > ~~~ ^  ~~
> >
> > Cc: Chih-Wei Huang <cwhu...@android-x86.org>
> > Cc: Eric Anholt <e...@anholt.net>
> > Cc: Ian Romanick <ian.d.roman...@intel.com>
> > Cc: mesa-sta...@lists.freedesktop.org
> > Fixes: c349031c27b7 ("i915: Fix texcoord vs. varying collision in fragment 
> > programs")
> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> Reviewed-by: Emil Velikov <emil.veli...@collabora.com>

Thanks. Pushed to master.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/device: Remove a use of a compound literal

2017-03-17 Thread Ville Syrjälä
On Fri, Mar 17, 2017 at 09:13:39AM -0700, Matt Turner wrote:
> On Fri, Mar 17, 2017 at 9:05 AM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
> > On Fri, Mar 17, 2017 at 8:45 AM, Matt Turner <matts...@gmail.com> wrote:
> >>
> >> On Thu, Mar 16, 2017 at 2:09 PM, Jason Ekstrand <ja...@jlekstrand.net>
> >> wrote:
> >> > Older versions of GCC don't like compound literals in static const
> >> > variable declarations because they don't think it's an actual constant
> >> > value.
> >>
> >> Probably because the type it was cast to was explicitly non-const.
> >
> >
> > So you think it will work if we use "(const VkExtent3D) { 1, 1, 1 }"?
> 
> Yep, that's my guess.

Pretty sure I tried that for the format stuff. Didn't work IIRC.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Prefer in-tree headers to out-of-tree headers

2016-11-30 Thread Ville Syrjälä
On Tue, Nov 29, 2016 at 08:54:26AM -0800, Jason Ekstrand wrote:
> On Tue, Nov 29, 2016 at 8:47 AM, Ville Syrjälä <
> ville.syrj...@linux.intel.com> wrote:
> 
> > On Tue, Nov 29, 2016 at 08:28:55AM -0800, Jason Ekstrand wrote:
> > > On Tue, Nov 29, 2016 at 1:23 AM, <ville.syrj...@linux.intel.com> wrote:
> > >
> > > > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > >
> > > > Set the include paths to consider in-tree headers before out-of-tree
> > > > headers.
> > > >
> > > > Avoids the build failing due to stale headers being present in
> > > > $prefix. Previosuly 'make -ki install' or something similar was
> > required
> > > > to update the out-of-tree headers to allow the build to succeed.
> > > >
> > > > Also avoids having to rebuild the entire thing after every 'make
> > > > install'.
> > > >
> > > > Cc: Rob Clark <robdcl...@gmail.com>
> > > > Cc: Jason Ekstrand <jason.ekstr...@intel.com>
> > > > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > > > ---
> > > >  src/intel/vulkan/Makefile.am | 16 +++-
> > > >  1 file changed, 11 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/src/intel/vulkan/Makefile.am
> > b/src/intel/vulkan/Makefile.am
> > > > index ce31abb2fce7..b80f8767296c 100644
> > > > --- a/src/intel/vulkan/Makefile.am
> > > > +++ b/src/intel/vulkan/Makefile.am
> > > > @@ -43,9 +43,6 @@ noinst_LTLIBRARIES = $(PER_GEN_LIBS)
> > > >  # The gallium includes are for the util/u_math.h include from
> > > > main/macros.h
> > > >
> > > >  AM_CPPFLAGS = \
> > > > -   $(INTEL_CFLAGS) \
> > > > -   $(VALGRIND_CFLAGS) \
> > > > -   $(DEFINES) \
> > > > -I$(top_srcdir)/include \
> > > > -I$(top_builddir)/src \
> > > > -I$(top_srcdir)/src \
> > > > @@ -61,6 +58,17 @@ AM_CPPFLAGS = \
> > > > -I$(top_builddir)/src/intel \
> > > > -I$(top_srcdir)/src/intel
> > > >
> > > > +if HAVE_PLATFORM_WAYLAND
> > > > +AM_CPPFLAGS += \
> > > > +   -I$(top_builddir)/src/egl/wayland/wayland-drm \
> > > > +   -I$(top_srcdir)/src/egl/wayland/wayland-drm
> > > > +endif
> > > >
> > >
> > > I think I have a mild preference for keeping wayland stuff together and
> > > moving the last AM_CPPFLAGS down but I don't care that much.  Either way,
> >
> > Would we still want all internal -I knobs to appear before any
> > external ones? To do that I'd have flip the x11 vs. wayland stuff
> > around. And if someone were to add internal -I knobs for x11 then
> > this scheme wouldn't work either way.
> >
> 
> Oh... I dind't think about that interaction.  Go with the way you had it
> for now.
> 
> 
> > Not sure if sticking to a strict global ordering like that is needed,
> > but at least it would seem a bit easier to maintain as you wouldn't
> > have to think too hard when adding new flags.
> >
> 
> We could have two things and then combine them later but that seems a bit
> painful.  Feel free to ignore my comments. :)

I did ;) Patch pushed. Thanks for the reviews.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Prefer in-tree headers to out-of-tree headers

2016-11-29 Thread Ville Syrjälä
On Tue, Nov 29, 2016 at 08:28:55AM -0800, Jason Ekstrand wrote:
> On Tue, Nov 29, 2016 at 1:23 AM, <ville.syrj...@linux.intel.com> wrote:
> 
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> >
> > Set the include paths to consider in-tree headers before out-of-tree
> > headers.
> >
> > Avoids the build failing due to stale headers being present in
> > $prefix. Previosuly 'make -ki install' or something similar was required
> > to update the out-of-tree headers to allow the build to succeed.
> >
> > Also avoids having to rebuild the entire thing after every 'make
> > install'.
> >
> > Cc: Rob Clark <robdcl...@gmail.com>
> > Cc: Jason Ekstrand <jason.ekstr...@intel.com>
> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > ---
> >  src/intel/vulkan/Makefile.am | 16 +++-
> >  1 file changed, 11 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
> > index ce31abb2fce7..b80f8767296c 100644
> > --- a/src/intel/vulkan/Makefile.am
> > +++ b/src/intel/vulkan/Makefile.am
> > @@ -43,9 +43,6 @@ noinst_LTLIBRARIES = $(PER_GEN_LIBS)
> >  # The gallium includes are for the util/u_math.h include from
> > main/macros.h
> >
> >  AM_CPPFLAGS = \
> > -   $(INTEL_CFLAGS) \
> > -   $(VALGRIND_CFLAGS) \
> > -   $(DEFINES) \
> > -I$(top_srcdir)/include \
> > -I$(top_builddir)/src \
> > -I$(top_srcdir)/src \
> > @@ -61,6 +58,17 @@ AM_CPPFLAGS = \
> > -I$(top_builddir)/src/intel \
> > -I$(top_srcdir)/src/intel
> >
> > +if HAVE_PLATFORM_WAYLAND
> > +AM_CPPFLAGS += \
> > +   -I$(top_builddir)/src/egl/wayland/wayland-drm \
> > +   -I$(top_srcdir)/src/egl/wayland/wayland-drm
> > +endif
> >
> 
> I think I have a mild preference for keeping wayland stuff together and
> moving the last AM_CPPFLAGS down but I don't care that much.  Either way,

Would we still want all internal -I knobs to appear before any
external ones? To do that I'd have flip the x11 vs. wayland stuff
around. And if someone were to add internal -I knobs for x11 then
this scheme wouldn't work either way.

Not sure if sticking to a strict global ordering like that is needed,
but at least it would seem a bit easier to maintain as you wouldn't
have to think too hard when adding new flags.

> 
> Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
> 
> 
> > +
> > +AM_CPPFLAGS += \
> > +   $(INTEL_CFLAGS) \
> > +   $(VALGRIND_CFLAGS) \
> > +   $(DEFINES)
> > +
> >  AM_CFLAGS = \
> > $(VISIBILITY_CFLAGS) \
> > -Wno-override-init -msse2
> > @@ -99,8 +107,6 @@ endif
> >
> >  if HAVE_PLATFORM_WAYLAND
> >  AM_CPPFLAGS += \
> > -   -I$(top_builddir)/src/egl/wayland/wayland-drm \
> > -   -I$(top_srcdir)/src/egl/wayland/wayland-drm \
> > $(WAYLAND_CFLAGS) \
> > -DVK_USE_PLATFORM_WAYLAND_KHR
> >
> > --
> > 2.7.4
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/formats: Fix build on gcc-4 and earlier

2016-10-03 Thread Ville Syrjälä
On Fri, Sep 30, 2016 at 04:11:51PM -0700, Jason Ekstrand wrote:
> I remember the first tone you fixed this but when I was going the
> ISL_SWIZZLE stuff,

Wasn't me, or at least I can't recall doing anything of the sort :)

> I couldn't find it in the git log so I went ahead and
> pushed the change. Thanks for fixing it again.
> 
> Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>

Thanks. Pushed.

> 
> On Sep 30, 2016 1:00 PM, <ville.syrj...@linux.intel.com> wrote:
> 
> > From: Ville Syrjälä <ville.syrj...@linux.intel.com>
> >
> > gcc-4 and earlier don't allow compound literals where a constant
> > is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE()
> > when populating the anv_formats[] array. There are a few ways around
> > it: First one would be -std=c89/gnu89, but the rest of the code
> > depends on c99 so it's not really an option. The second option
> > would be to upgrade to gcc-5+ where the compiler behaviour was relaxed
> > a bit [1]. And the third option is just to avoid using compound
> > literals. I chose the last option since it keeps gcc-4 and earlier
> > working.
> >
> > [1] https://gcc.gnu.org/gcc-5/porting_to.html
> >
> > Cc: Jason Ekstrand <ja...@jlekstrand.net>
> > Cc: Topi Pohjolainen <topi.pohjolai...@intel.com>
> > Fixes: 7ddb21708c80 ("intel/isl: Add an isl_swizzle structure and use it
> > for isl_view swizzles")
> > Signed-off-by: Ville Syrjälä <ville.syrj...@linux.intel.com>
> > ---
> >  src/intel/vulkan/anv_formats.c | 23 +++
> >  1 file changed, 19 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_
> > formats.c
> > index 7341d725cd0a..f6915540fb3d 100644
> > --- a/src/intel/vulkan/anv_formats.c
> > +++ b/src/intel/vulkan/anv_formats.c
> > @@ -24,9 +24,24 @@
> >  #include "anv_private.h"
> >  #include "vk_format_info.h"
> >
> > -#define RGBA ISL_SWIZZLE(RED, GREEN, BLUE, ALPHA)
> > -#define BGRA ISL_SWIZZLE(BLUE, GREEN, RED, ALPHA)
> > -#define RGB1 ISL_SWIZZLE(RED, GREEN, BLUE, ONE)
> > +/*
> > + * gcc-4 and earlier don't allow compound literals where a constant
> > + * is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE()
> > + * here. -std=c89/gnu89 would allow it, but we depend on c99 features
> > + * so using -std=c89/gnu89 is not an option. Starting from gcc-5
> > + * compound literals can also be considered constant in -std=c99/gnu99
> > + * mode.
> > + */
> > +#define _ISL_SWIZZLE(r, g, b, a) { \
> > +  ISL_CHANNEL_SELECT_##r, \
> > +  ISL_CHANNEL_SELECT_##g, \
> > +  ISL_CHANNEL_SELECT_##b, \
> > +  ISL_CHANNEL_SELECT_##a, \
> > +}
> > +
> > +#define RGBA _ISL_SWIZZLE(RED, GREEN, BLUE, ALPHA)
> > +#define BGRA _ISL_SWIZZLE(BLUE, GREEN, RED, ALPHA)
> > +#define RGB1 _ISL_SWIZZLE(RED, GREEN, BLUE, ONE)
> >
> >  #define swiz_fmt(__vk_fmt, __hw_fmt, __swizzle) \
> > [__vk_fmt] = { \
> > @@ -276,7 +291,7 @@ anv_get_format(const struct gen_device_info *devinfo,
> > VkFormat vk_format,
> >   format.isl_format = rgbx;
> >} else {
> >   format.isl_format = isl_format_rgb_to_rgba(format.isl_format);
> > - format.swizzle = RGB1;
> > + format.swizzle = ISL_SWIZZLE(RED, GREEN, BLUE, ONE);
> >}
> > }
> >
> > --
> > 2.7.4
> >
> >

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gbm: Add flags to enable creation of rotated scanout buffers (v3)

2015-11-06 Thread Ville Syrjälä
On Fri, Nov 06, 2015 at 06:56:13PM +0900, Michel Dänzer wrote:
> On 06.11.2015 12:08, Vivek Kasireddy wrote:
> > For certain platforms that support rotated scanout buffers, currently,
> > there is no way to create them with the GBM DRI interface. These flags
> > will instruct the DRI driver to create the buffer by setting
> > additional requirements such as tiling mode.
> > 
> > v2: Reserve a bit per angle. (Ville and Michel)
> > 
> > v3:
> > - Combine all GBM_BO_USE_SCANOUT_ROTATION_* flags into
> >   GBM_BO_USE_SCANOUT_ANY macro (Michel)
> > - Pull the code that updates dri_use based on the rotation flag
> >   into a separate function.
> 
> [...]
> 
> > diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
> > index 57cdeac..741e509 100644
> > --- a/src/gbm/backends/dri/gbm_dri.c
> > +++ b/src/gbm/backends/dri/gbm_dri.c
> > @@ -124,6 +124,20 @@ image_get_buffers(__DRIdrawable *driDrawable,
> >  }
> >  
> >  static void
> > +gbm_to_dri_flag(uint32_t usage,
> > +unsigned *dri_use)
> > +{
> > +   if (usage & GBM_BO_USE_SCANOUT)
> > +  *dri_use |= __DRI_IMAGE_USE_SCANOUT;
> > +   if (usage & GBM_BO_USE_SCANOUT_ROTATION_90)
> > +  *dri_use |= __DRI_IMAGE_USE_SCANOUT_ROTATION_90;
> > +   if (usage & GBM_BO_USE_SCANOUT_ROTATION_180)
> > +  *dri_use |= __DRI_IMAGE_USE_SCANOUT_ROTATION_180;
> > +   if (usage & GBM_BO_USE_SCANOUT_ROTATION_270)
> > +  *dri_use |= __DRI_IMAGE_USE_SCANOUT_ROTATION_270;
> > +}
> 
> I like the idea of this helper function, but it could handle
> GBM_BO_USE_CURSOR and GBM_BO_USE_LINEAR as well. Ideally, there would be
> a separate preparatory change which creates the helper function and
> makes gbm_dri_bo_import and gbm_dri_bo_create use it; then this change
> can just add the new flags in the helper function. If that's too much
> trouble, the handling of the other flags can be moved into the helper
> function in a followup change.
> 
> 
> > diff --git a/src/gbm/main/gbm.h b/src/gbm/main/gbm.h
> > index 8db2153..4bda089 100644
> > --- a/src/gbm/main/gbm.h
> > +++ b/src/gbm/main/gbm.h
> > @@ -214,6 +214,13 @@ enum gbm_bo_flags {
> >  * Buffer is linear, i.e. not tiled.
> >  */
> > GBM_BO_USE_LINEAR = (1 << 4),
> > +   /**
> > +* Buffer would be rotated and some platforms have additional tiling
> > +* requirements for rotated scanout buffers.
> > +*/
> > +   GBM_BO_USE_SCANOUT_ROTATION_90 = (1 << 5),
> > +   GBM_BO_USE_SCANOUT_ROTATION_180 = (1 << 6),
> > +   GBM_BO_USE_SCANOUT_ROTATION_270 = (1 << 7),
> >  };
> 
> Hmm, we should probably explicitly specify the orientation of the 90 and
> 270 degree rotations. Clockwise? (Same in patch 1)

kms (and xrandr) are ccw.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] gbm: Add a flag to enable creation of rotated scanout buffers

2015-10-27 Thread Ville Syrjälä
On Fri, Oct 23, 2015 at 06:25:55PM -0700, Vivek Kasireddy wrote:
> On Fri, 23 Oct 2015 15:29:08 +0300
> Ville Syrjälä <ville.syrj...@linux.intel.com> wrote:
> 
> > On Fri, Oct 23, 2015 at 12:18:39PM +0900, Michel Dänzer wrote:
> > > On 23.10.2015 10:44, Vivek Kasireddy wrote:
> > > > For certain platforms that support rotated scanout buffers,
> > > > currently, there is no way to create them with the GBM DRI
> > > > interface. This flag will instruct the DRI driver to create the
> > > > buffer by setting additional requirements.
> > > > 
> > > > Cc: Kristian Hogsberg <k...@bitplanet.net>
> > > > Signed-off-by: Vivek Kasireddy <vivek.kasire...@intel.com>
> > > > ---
> > > >  include/GL/internal/dri_interface.h | 1 +
> > > >  src/gbm/backends/dri/gbm_dri.c  | 9 +++--
> > > >  src/gbm/main/gbm.h  | 5 +
> > > >  3 files changed, 13 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/include/GL/internal/dri_interface.h
> > > > b/include/GL/internal/dri_interface.h index a0f155a..2271217
> > > > 100644 --- a/include/GL/internal/dri_interface.h
> > > > +++ b/include/GL/internal/dri_interface.h
> > > > @@ -1091,6 +1091,7 @@ struct __DRIdri2ExtensionRec {
> > > >  #define __DRI_IMAGE_USE_SCANOUT0x0002
> > > >  #define __DRI_IMAGE_USE_CURSOR 0x0004 /*
> > > > Depricated */ #define __DRI_IMAGE_USE_LINEAR0x0008
> > > > +#define __DRI_IMAGE_USE_SCANOUT_ROTATED_90_270
> > > > 0x0010 
> > > >  
> > > >  /**
> > > 
> > > Thank you for splitting out the driver change. Sorry I didn't think
> > > of this before, but it might be worth splitting out the
> > > dri_interface.h change as well. I'm fine either way, though.
> > > 
> > > 
> > > > diff --git a/src/gbm/backends/dri/gbm_dri.c
> > > > b/src/gbm/backends/dri/gbm_dri.c index 57cdeac..cde63de 100644
> > > > --- a/src/gbm/backends/dri/gbm_dri.c
> > > > +++ b/src/gbm/backends/dri/gbm_dri.c
> > > > @@ -539,7 +539,7 @@ gbm_dri_is_format_supported(struct gbm_device
> > > > *gbm, break;
> > > > case GBM_BO_FORMAT_ARGB:
> > > > case GBM_FORMAT_ARGB:
> > > > -  if (usage & GBM_BO_USE_SCANOUT)
> > > > +  if (usage & (GBM_BO_USE_SCANOUT |
> > > > GBM_BO_USE_SCANOUT_ROTATED_90_270)) return 0;
> > > >break;
> > > > default:
> > > > @@ -748,6 +748,8 @@ gbm_dri_bo_import(struct gbm_device *gbm,
> > > >  
> > > > if (usage & GBM_BO_USE_SCANOUT)
> > > >dri_use |= __DRI_IMAGE_USE_SCANOUT;
> > > > +   if (usage & GBM_BO_USE_SCANOUT_ROTATED_90_270)
> > > > +  dri_use |= __DRI_IMAGE_USE_SCANOUT_ROTATED_90_270;
> > > > if (usage & GBM_BO_USE_CURSOR)
> > > >dri_use |= __DRI_IMAGE_USE_CURSOR;
> > > > if (dri->image->base.version >= 2 &&
> > > > @@ -786,7 +788,8 @@ create_dumb(struct gbm_device *gbm,
> > > >  
> > > > is_cursor = (usage & GBM_BO_USE_CURSOR) != 0 &&
> > > >format == GBM_FORMAT_ARGB;
> > > > -   is_scanout = (usage & GBM_BO_USE_SCANOUT) != 0 &&
> > > > +   is_scanout = (usage & (GBM_BO_USE_SCANOUT |
> > > > +  GBM_BO_USE_SCANOUT_ROTATED_90_270)) != 0 &&
> > > >format == GBM_FORMAT_XRGB;
> > > > if (!is_cursor && !is_scanout) {
> > > >errno = EINVAL;
> > > > @@ -880,6 +883,8 @@ gbm_dri_bo_create(struct gbm_device *gbm,
> > > >  
> > > > if (usage & GBM_BO_USE_SCANOUT)
> > > >dri_use |= __DRI_IMAGE_USE_SCANOUT;
> > > > +   if (usage & GBM_BO_USE_SCANOUT_ROTATED_90_270)
> > > > +  dri_use |= __DRI_IMAGE_USE_SCANOUT_ROTATED_90_270;
> > > > if (usage & GBM_BO_USE_CURSOR)
> > > >dri_use |= __DRI_IMAGE_USE_CURSOR;
> > > > if (usage & GBM_BO_USE_LINEAR)
> > > > diff --git a/src/gbm/main/gbm.h b/src/gbm/main/gbm.h
> > > > index 2708e50..2ef7bd8 100644
> > > > --- a/src/gbm/main/gbm.h
> > > > +++ b/src/gbm/main/gbm.h
> > > > @@ -213,6 +213,11 @@ enum gbm_bo_flags {
> > > >  * Buffer is linear, i.e. not tiled.

Re: [Mesa-dev] [PATCH 1/2] gbm: Add a flag to enable creation of rotated scanout buffers

2015-10-23 Thread Ville Syrjälä
On Fri, Oct 23, 2015 at 12:18:39PM +0900, Michel Dänzer wrote:
> On 23.10.2015 10:44, Vivek Kasireddy wrote:
> > For certain platforms that support rotated scanout buffers, currently,
> > there is no way to create them with the GBM DRI interface. This flag
> > will instruct the DRI driver to create the buffer by setting
> > additional requirements.
> > 
> > Cc: Kristian Hogsberg <k...@bitplanet.net>
> > Signed-off-by: Vivek Kasireddy <vivek.kasire...@intel.com>
> > ---
> >  include/GL/internal/dri_interface.h | 1 +
> >  src/gbm/backends/dri/gbm_dri.c  | 9 +++--
> >  src/gbm/main/gbm.h  | 5 +
> >  3 files changed, 13 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/GL/internal/dri_interface.h 
> > b/include/GL/internal/dri_interface.h
> > index a0f155a..2271217 100644
> > --- a/include/GL/internal/dri_interface.h
> > +++ b/include/GL/internal/dri_interface.h
> > @@ -1091,6 +1091,7 @@ struct __DRIdri2ExtensionRec {
> >  #define __DRI_IMAGE_USE_SCANOUT0x0002
> >  #define __DRI_IMAGE_USE_CURSOR 0x0004 /* Depricated */
> >  #define __DRI_IMAGE_USE_LINEAR 0x0008
> > +#define __DRI_IMAGE_USE_SCANOUT_ROTATED_90_270 0x0010
> >  
> >  
> >  /**
> 
> Thank you for splitting out the driver change. Sorry I didn't think of
> this before, but it might be worth splitting out the dri_interface.h
> change as well. I'm fine either way, though.
> 
> 
> > diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
> > index 57cdeac..cde63de 100644
> > --- a/src/gbm/backends/dri/gbm_dri.c
> > +++ b/src/gbm/backends/dri/gbm_dri.c
> > @@ -539,7 +539,7 @@ gbm_dri_is_format_supported(struct gbm_device *gbm,
> >break;
> > case GBM_BO_FORMAT_ARGB:
> > case GBM_FORMAT_ARGB:
> > -  if (usage & GBM_BO_USE_SCANOUT)
> > +  if (usage & (GBM_BO_USE_SCANOUT | GBM_BO_USE_SCANOUT_ROTATED_90_270))
> >   return 0;
> >break;
> > default:
> > @@ -748,6 +748,8 @@ gbm_dri_bo_import(struct gbm_device *gbm,
> >  
> > if (usage & GBM_BO_USE_SCANOUT)
> >dri_use |= __DRI_IMAGE_USE_SCANOUT;
> > +   if (usage & GBM_BO_USE_SCANOUT_ROTATED_90_270)
> > +  dri_use |= __DRI_IMAGE_USE_SCANOUT_ROTATED_90_270;
> > if (usage & GBM_BO_USE_CURSOR)
> >dri_use |= __DRI_IMAGE_USE_CURSOR;
> > if (dri->image->base.version >= 2 &&
> > @@ -786,7 +788,8 @@ create_dumb(struct gbm_device *gbm,
> >  
> > is_cursor = (usage & GBM_BO_USE_CURSOR) != 0 &&
> >format == GBM_FORMAT_ARGB;
> > -   is_scanout = (usage & GBM_BO_USE_SCANOUT) != 0 &&
> > +   is_scanout = (usage & (GBM_BO_USE_SCANOUT |
> > +  GBM_BO_USE_SCANOUT_ROTATED_90_270)) != 0 &&
> >format == GBM_FORMAT_XRGB;
> > if (!is_cursor && !is_scanout) {
> >errno = EINVAL;
> > @@ -880,6 +883,8 @@ gbm_dri_bo_create(struct gbm_device *gbm,
> >  
> > if (usage & GBM_BO_USE_SCANOUT)
> >dri_use |= __DRI_IMAGE_USE_SCANOUT;
> > +   if (usage & GBM_BO_USE_SCANOUT_ROTATED_90_270)
> > +  dri_use |= __DRI_IMAGE_USE_SCANOUT_ROTATED_90_270;
> > if (usage & GBM_BO_USE_CURSOR)
> >dri_use |= __DRI_IMAGE_USE_CURSOR;
> > if (usage & GBM_BO_USE_LINEAR)
> > diff --git a/src/gbm/main/gbm.h b/src/gbm/main/gbm.h
> > index 2708e50..2ef7bd8 100644
> > --- a/src/gbm/main/gbm.h
> > +++ b/src/gbm/main/gbm.h
> > @@ -213,6 +213,11 @@ enum gbm_bo_flags {
> >  * Buffer is linear, i.e. not tiled.
> >  */
> > GBM_BO_USE_LINEAR = (1 << 4),
> > +   /**
> > +* Buffer would be rotated and some platforms have additional tiling
> > +* requirements for 90/270 rotated buffers.
> > +*/
> > +   GBM_BO_USE_SCANOUT_ROTATED_90_270 = (1 << 5),
> >  };
> >  
> >  int
> > 
> 
> I asked internally, and apparently our display hardware requires a
> rotation specific tiling mode for 180 degree rotation as well. In order
> to avoid having to add *_SCANOUT_ROTATED_180 later, would
> *_SCANOUT_ROTATED work for you as well? Or would using Y-tiling for 180
> degree rotation be an issue?

What about a bit per angle? To avoid hardware specifics.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines

2015-10-21 Thread Ville Syrjälä
On Tue, Oct 20, 2015 at 02:02:21PM +0300, Ville Syrjälä wrote:
> On Tue, Oct 20, 2015 at 08:15:32AM +, Predut, Marius wrote:
> > > -Original Message-
> > > From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
> > > Sent: Monday, October 19, 2015 6:04 PM
> > > To: Predut, Marius
> > > Cc: mesa-dev@lists.freedesktop.org; Iago Toral Quiroga
> > > Subject: Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for 
> > > thinnest
> > > width lines
> > > 
> > > On Thu, Oct 15, 2015 at 06:03:34PM +0300, Ville Syrjälä wrote:
> > > > On Thu, Oct 15, 2015 at 02:19:09PM +, Predut, Marius wrote:
> > > > > > -Original Message-
> > > > > > From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
> > > > > > Sent: Wednesday, October 07, 2015 1:53 PM
> > > > > > To: Predut, Marius
> > > > > > Cc: mesa-dev@lists.freedesktop.org
> > > > > > Subject: Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug
> > > > > > for thinnest width lines
> > > > > >
> > > > > > On Mon, Oct 05, 2015 at 07:55:24PM +0300, Marius Predut wrote:
> > > > > > > On PNV platform, for 1 pixel line thickness or less, the general
> > > > > > > anti-aliasing algorithm gives up, and a garbage line is generated.
> > > > > > > Setting a Line Width of 0.0 specifies the rasterization of the
> > > > > > > "thinnest" (one-pixel-wide), non-antialiased lines.
> > > > > > > Lines rendered with zero Line Width are rasterized using Grid
> > > > > > > Intersection Quantization rules as specified by
> > > > > > > 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization from volume 1f
> > > > > > > of the
> > > > > > > GEN3 docs.
> > > > > > > The patch was tested on Intel Atom CPU N455.
> > > > > > >
> > > > > > > This patch follow the same rules as patches fixing the
> > > > > > > https://bugs.freedesktop.org/show_bug.cgi?id=28832
> > > > > > > bug.
> > > > > > >
> > > > > > > v1: Eduardo Lima Mitev:  Wrong indentation inside the if clause.
> > > > > > > v2: Ian Romanick: comments fix.
> > > > > > >
> > > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90367
> > > > > > >
> > > > > > > Signed-off-by: Marius Predut <marius.pre...@intel.com>
> > > > > > > ---
> > > > > > >  src/mesa/drivers/dri/i915/i915_state.c | 15 +++
> > > > > > >  1 file changed, 15 insertions(+)
> > > > > > >
> > > > > > > diff --git a/src/mesa/drivers/dri/i915/i915_state.c
> > > > > > > b/src/mesa/drivers/dri/i915/i915_state.c
> > > > > > > index 4c83073..897eb59 100644
> > > > > > > --- a/src/mesa/drivers/dri/i915/i915_state.c
> > > > > > > +++ b/src/mesa/drivers/dri/i915/i915_state.c
> > > > > > > @@ -599,6 +599,21 @@ i915LineWidth(struct gl_context * ctx,
> > > > > > > GLfloat
> > > > > > > widthf)
> > > > > > >
> > > > > > > width = (int) (widthf * 2);
> > > > > > > width = CLAMP(width, 1, 0xf);
> > > > > > > +
> > > > > > > +   if (ctx->Line.Width < 1.5 || widthf < 1.5) {
> > > > > > > + /* For 1 pixel line thickness or less, the general
> > > > > > > +  * anti-aliasing algorithm gives up, and a garbage line is
> > > > > > > +  * generated.  Setting a Line Width of 0.0 specifies the
> > > > > > > +  * rasterization of the "thinnest" (one-pixel-wide),
> > > > > > > +  * non-antialiased lines.
> > > > > > > +  *
> > > > > > > +  * Lines rendered with zero Line Width are rasterized using
> > > > > > > +  * Grid Intersection Quantization rules as specified by
> > > > > > > +  * volume 1f of the GEN3 docs,
> > > > > > > +  * 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization.
> > > > > > > +  */
> > > > > > > +  width = 0;
> > > > &

Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines

2015-10-20 Thread Ville Syrjälä
On Tue, Oct 20, 2015 at 08:15:32AM +, Predut, Marius wrote:
> > -Original Message-
> > From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
> > Sent: Monday, October 19, 2015 6:04 PM
> > To: Predut, Marius
> > Cc: mesa-dev@lists.freedesktop.org; Iago Toral Quiroga
> > Subject: Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for 
> > thinnest
> > width lines
> > 
> > On Thu, Oct 15, 2015 at 06:03:34PM +0300, Ville Syrjälä wrote:
> > > On Thu, Oct 15, 2015 at 02:19:09PM +, Predut, Marius wrote:
> > > > > -Original Message-
> > > > > From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
> > > > > Sent: Wednesday, October 07, 2015 1:53 PM
> > > > > To: Predut, Marius
> > > > > Cc: mesa-dev@lists.freedesktop.org
> > > > > Subject: Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug
> > > > > for thinnest width lines
> > > > >
> > > > > On Mon, Oct 05, 2015 at 07:55:24PM +0300, Marius Predut wrote:
> > > > > > On PNV platform, for 1 pixel line thickness or less, the general
> > > > > > anti-aliasing algorithm gives up, and a garbage line is generated.
> > > > > > Setting a Line Width of 0.0 specifies the rasterization of the
> > > > > > "thinnest" (one-pixel-wide), non-antialiased lines.
> > > > > > Lines rendered with zero Line Width are rasterized using Grid
> > > > > > Intersection Quantization rules as specified by
> > > > > > 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization from volume 1f
> > > > > > of the
> > > > > > GEN3 docs.
> > > > > > The patch was tested on Intel Atom CPU N455.
> > > > > >
> > > > > > This patch follow the same rules as patches fixing the
> > > > > > https://bugs.freedesktop.org/show_bug.cgi?id=28832
> > > > > > bug.
> > > > > >
> > > > > > v1: Eduardo Lima Mitev:  Wrong indentation inside the if clause.
> > > > > > v2: Ian Romanick: comments fix.
> > > > > >
> > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90367
> > > > > >
> > > > > > Signed-off-by: Marius Predut <marius.pre...@intel.com>
> > > > > > ---
> > > > > >  src/mesa/drivers/dri/i915/i915_state.c | 15 +++
> > > > > >  1 file changed, 15 insertions(+)
> > > > > >
> > > > > > diff --git a/src/mesa/drivers/dri/i915/i915_state.c
> > > > > > b/src/mesa/drivers/dri/i915/i915_state.c
> > > > > > index 4c83073..897eb59 100644
> > > > > > --- a/src/mesa/drivers/dri/i915/i915_state.c
> > > > > > +++ b/src/mesa/drivers/dri/i915/i915_state.c
> > > > > > @@ -599,6 +599,21 @@ i915LineWidth(struct gl_context * ctx,
> > > > > > GLfloat
> > > > > > widthf)
> > > > > >
> > > > > > width = (int) (widthf * 2);
> > > > > > width = CLAMP(width, 1, 0xf);
> > > > > > +
> > > > > > +   if (ctx->Line.Width < 1.5 || widthf < 1.5) {
> > > > > > + /* For 1 pixel line thickness or less, the general
> > > > > > +  * anti-aliasing algorithm gives up, and a garbage line is
> > > > > > +  * generated.  Setting a Line Width of 0.0 specifies the
> > > > > > +  * rasterization of the "thinnest" (one-pixel-wide),
> > > > > > +  * non-antialiased lines.
> > > > > > +  *
> > > > > > +  * Lines rendered with zero Line Width are rasterized using
> > > > > > +  * Grid Intersection Quantization rules as specified by
> > > > > > +  * volume 1f of the GEN3 docs,
> > > > > > +  * 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization.
> > > > > > +  */
> > > > > > +  width = 0;
> > > > > > +   }
> > > > >
> > > > > I went to do some spec reading, and while I can't confirm the AA
> > > > > <= 1.0 problem (no mention in the spec about such things), I can
> > > > > see this fix alone isn't sufficient to satisfy the spec (we lack
> > > > > the round to nearest integer for non-aa for instance).
> > > >
> > &

Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines

2015-10-19 Thread Ville Syrjälä
On Thu, Oct 15, 2015 at 06:03:34PM +0300, Ville Syrjälä wrote:
> On Thu, Oct 15, 2015 at 02:19:09PM +, Predut, Marius wrote:
> > > -Original Message-
> > > From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
> > > Sent: Wednesday, October 07, 2015 1:53 PM
> > > To: Predut, Marius
> > > Cc: mesa-dev@lists.freedesktop.org
> > > Subject: Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for 
> > > thinnest
> > > width lines
> > > 
> > > On Mon, Oct 05, 2015 at 07:55:24PM +0300, Marius Predut wrote:
> > > > On PNV platform, for 1 pixel line thickness or less, the general
> > > > anti-aliasing algorithm gives up, and a garbage line is generated.
> > > > Setting a Line Width of 0.0 specifies the rasterization of the
> > > > "thinnest" (one-pixel-wide), non-antialiased lines.
> > > > Lines rendered with zero Line Width are rasterized using Grid
> > > > Intersection Quantization rules as specified by
> > > > 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization from volume 1f of the
> > > > GEN3 docs.
> > > > The patch was tested on Intel Atom CPU N455.
> > > >
> > > > This patch follow the same rules as patches fixing the
> > > > https://bugs.freedesktop.org/show_bug.cgi?id=28832
> > > > bug.
> > > >
> > > > v1: Eduardo Lima Mitev:  Wrong indentation inside the if clause.
> > > > v2: Ian Romanick: comments fix.
> > > >
> > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90367
> > > >
> > > > Signed-off-by: Marius Predut <marius.pre...@intel.com>
> > > > ---
> > > >  src/mesa/drivers/dri/i915/i915_state.c | 15 +++
> > > >  1 file changed, 15 insertions(+)
> > > >
> > > > diff --git a/src/mesa/drivers/dri/i915/i915_state.c
> > > > b/src/mesa/drivers/dri/i915/i915_state.c
> > > > index 4c83073..897eb59 100644
> > > > --- a/src/mesa/drivers/dri/i915/i915_state.c
> > > > +++ b/src/mesa/drivers/dri/i915/i915_state.c
> > > > @@ -599,6 +599,21 @@ i915LineWidth(struct gl_context * ctx, GLfloat
> > > > widthf)
> > > >
> > > > width = (int) (widthf * 2);
> > > > width = CLAMP(width, 1, 0xf);
> > > > +
> > > > +   if (ctx->Line.Width < 1.5 || widthf < 1.5) {
> > > > + /* For 1 pixel line thickness or less, the general
> > > > +  * anti-aliasing algorithm gives up, and a garbage line is
> > > > +  * generated.  Setting a Line Width of 0.0 specifies the
> > > > +  * rasterization of the "thinnest" (one-pixel-wide),
> > > > +  * non-antialiased lines.
> > > > +  *
> > > > +  * Lines rendered with zero Line Width are rasterized using
> > > > +  * Grid Intersection Quantization rules as specified by
> > > > +  * volume 1f of the GEN3 docs,
> > > > +  * 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization.
> > > > +  */
> > > > +  width = 0;
> > > > +   }
> > > 
> > > I went to do some spec reading, and while I can't confirm the AA <= 1.0
> > > problem (no mention in the spec about such things), I can see this fix 
> > > alone
> > > isn't sufficient to satisfy the spec (we lack the round to nearest 
> > > integer for
> > > non-aa for instance).
> > 
> > Ville ,Thanks for review!
> > On this seem not too much docs, here can use experiments or docs for next 
> > GEN+.
> > 
> > > 
> > > I think what we'd want is a small helper. i965 has one, although that one
> > > looks quite messy. I think this is how I'd write the helper for
> > > i915:
> > > 
> > > unsigned intel_line_width(ctx)
> > > {
> > >   float line_width = ctx->Line.Width;
> > > 
> > >   if (ctx->Line.SmoothFlag)
> > >   line_width = CLAMP(line_width, MinAA, MaxAA);
> > >   else
> > >   line_width = CLAMP(roundf(line_width), Min, Max);
> > > 
> > >   /*
> > >* blah
> > >*/
> > >   if (line_width < 1.5f)
> > >   line_width = 0.0f
> > > 
> > >   return U_FIXED(line_width, 1);
> > > }
> > > 
> > > and then use it for both gen2 and gen3 state setup.
> > 
> > Do you used this and it works for you? (I mean if you did a 

Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines

2015-10-15 Thread Ville Syrjälä
On Thu, Oct 15, 2015 at 02:19:09PM +, Predut, Marius wrote:
> > -Original Message-
> > From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
> > Sent: Wednesday, October 07, 2015 1:53 PM
> > To: Predut, Marius
> > Cc: mesa-dev@lists.freedesktop.org
> > Subject: Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for 
> > thinnest
> > width lines
> > 
> > On Mon, Oct 05, 2015 at 07:55:24PM +0300, Marius Predut wrote:
> > > On PNV platform, for 1 pixel line thickness or less, the general
> > > anti-aliasing algorithm gives up, and a garbage line is generated.
> > > Setting a Line Width of 0.0 specifies the rasterization of the
> > > "thinnest" (one-pixel-wide), non-antialiased lines.
> > > Lines rendered with zero Line Width are rasterized using Grid
> > > Intersection Quantization rules as specified by
> > > 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization from volume 1f of the
> > > GEN3 docs.
> > > The patch was tested on Intel Atom CPU N455.
> > >
> > > This patch follow the same rules as patches fixing the
> > > https://bugs.freedesktop.org/show_bug.cgi?id=28832
> > > bug.
> > >
> > > v1: Eduardo Lima Mitev:  Wrong indentation inside the if clause.
> > > v2: Ian Romanick: comments fix.
> > >
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90367
> > >
> > > Signed-off-by: Marius Predut <marius.pre...@intel.com>
> > > ---
> > >  src/mesa/drivers/dri/i915/i915_state.c | 15 +++
> > >  1 file changed, 15 insertions(+)
> > >
> > > diff --git a/src/mesa/drivers/dri/i915/i915_state.c
> > > b/src/mesa/drivers/dri/i915/i915_state.c
> > > index 4c83073..897eb59 100644
> > > --- a/src/mesa/drivers/dri/i915/i915_state.c
> > > +++ b/src/mesa/drivers/dri/i915/i915_state.c
> > > @@ -599,6 +599,21 @@ i915LineWidth(struct gl_context * ctx, GLfloat
> > > widthf)
> > >
> > > width = (int) (widthf * 2);
> > > width = CLAMP(width, 1, 0xf);
> > > +
> > > +   if (ctx->Line.Width < 1.5 || widthf < 1.5) {
> > > + /* For 1 pixel line thickness or less, the general
> > > +  * anti-aliasing algorithm gives up, and a garbage line is
> > > +  * generated.  Setting a Line Width of 0.0 specifies the
> > > +  * rasterization of the "thinnest" (one-pixel-wide),
> > > +  * non-antialiased lines.
> > > +  *
> > > +  * Lines rendered with zero Line Width are rasterized using
> > > +  * Grid Intersection Quantization rules as specified by
> > > +  * volume 1f of the GEN3 docs,
> > > +  * 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization.
> > > +  */
> > > +  width = 0;
> > > +   }
> > 
> > I went to do some spec reading, and while I can't confirm the AA <= 1.0
> > problem (no mention in the spec about such things), I can see this fix alone
> > isn't sufficient to satisfy the spec (we lack the round to nearest integer 
> > for
> > non-aa for instance).
> 
> Ville ,Thanks for review!
> On this seem not too much docs, here can use experiments or docs for next 
> GEN+.
> 
> > 
> > I think what we'd want is a small helper. i965 has one, although that one
> > looks quite messy. I think this is how I'd write the helper for
> > i915:
> > 
> > unsigned intel_line_width(ctx)
> > {
> > float line_width = ctx->Line.Width;
> > 
> > if (ctx->Line.SmoothFlag)
> > line_width = CLAMP(line_width, MinAA, MaxAA);
> > else
> > line_width = CLAMP(roundf(line_width), Min, Max);
> > 
> > /*
> >  * blah
> >  */
> > if (line_width < 1.5f)
> > line_width = 0.0f
> > 
> > return U_FIXED(line_width, 1);
> > }
> > 
> > and then use it for both gen2 and gen3 state setup.
> 
> Do you used this and it works for you? (I mean if you did a test on your PNV 
> platform)

Didn't do any actual testing yet. I've been meaning to, but just been
too busy with other stuff. I can try to test on pnv today, and maybe on
830 and 85x on the weekend. Hmm, I wonder if the test even works on
gl1?

> I have some comments on the Bugzilla related to SmoothFlag flag.(on 
> 2015-06-04).
> On my tests seems the flag is set only if call glLineWidth (lineWidth), 
> lineWidth != 1.
> 
> > 
> > The clamp part could even ve moved to some central place so that all drivers

Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines

2015-10-07 Thread Ville Syrjälä
On Mon, Oct 05, 2015 at 07:55:24PM +0300, Marius Predut wrote:
> On PNV platform, for 1 pixel line thickness or less,
> the general anti-aliasing algorithm gives up, and a garbage line is generated.
> Setting a Line Width of 0.0 specifies the rasterization
> of the "thinnest" (one-pixel-wide), non-antialiased lines.
> Lines rendered with zero Line Width are rasterized using
> Grid Intersection Quantization rules as specified by
> 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization from volume 1f of the GEN3
> docs.
> The patch was tested on Intel Atom CPU N455.
> 
> This patch follow the same rules as patches fixing the
> https://bugs.freedesktop.org/show_bug.cgi?id=28832
> bug.
> 
> v1: Eduardo Lima Mitev:  Wrong indentation inside the if clause.
> v2: Ian Romanick: comments fix.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90367
> 
> Signed-off-by: Marius Predut <marius.pre...@intel.com>
> ---
>  src/mesa/drivers/dri/i915/i915_state.c | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i915/i915_state.c 
> b/src/mesa/drivers/dri/i915/i915_state.c
> index 4c83073..897eb59 100644
> --- a/src/mesa/drivers/dri/i915/i915_state.c
> +++ b/src/mesa/drivers/dri/i915/i915_state.c
> @@ -599,6 +599,21 @@ i915LineWidth(struct gl_context * ctx, GLfloat widthf)
> 
> width = (int) (widthf * 2);
> width = CLAMP(width, 1, 0xf);
> +
> +   if (ctx->Line.Width < 1.5 || widthf < 1.5) {
> + /* For 1 pixel line thickness or less, the general
> +  * anti-aliasing algorithm gives up, and a garbage line is
> +  * generated.  Setting a Line Width of 0.0 specifies the
> +  * rasterization of the "thinnest" (one-pixel-wide),
> +  * non-antialiased lines.
> +  *
> +  * Lines rendered with zero Line Width are rasterized using
> +  * Grid Intersection Quantization rules as specified by
> +  * volume 1f of the GEN3 docs,
> +  * 2.8.4.1 Zero-Width (Cosmetic) Line Rasterization. 
> +  */
> +  width = 0;
> +   }

I went to do some spec reading, and while I can't confirm the AA <= 1.0
problem (no mention in the spec about such things), I can see this fix
alone isn't sufficient to satisfy the spec (we lack the round to nearest
integer for non-aa for instance).

I think what we'd want is a small helper. i965 has one, although
that one looks quite messy. I think this is how I'd write the helper for
i915:

unsigned intel_line_width(ctx)
{
float line_width = ctx->Line.Width;

if (ctx->Line.SmoothFlag)
line_width = CLAMP(line_width, MinAA, MaxAA);
else
line_width = CLAMP(roundf(line_width), Min, Max);

/*
 * blah
 */
if (line_width < 1.5f)
line_width = 0.0f

return U_FIXED(line_width, 1);
}

and then use it for both gen2 and gen3 state setup.

The clamp part could even ve moved to some central place so that all
drivers could share it, or I suppose we could stash the appropriately
rounded and clamped line width into the context as ctx->Line._Width.

Oh and BTW, the gen4/5 line width handling in i965 looks busted too
(only gen6+ got fixed).

> lis4 |= width << S4_LINE_WIDTH_SHIFT;
>  
> if (lis4 != i915->state.Ctx[I915_CTXREG_LIS4]) {
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/20] i915: Pile of fixes and cleanups

2015-10-02 Thread Ville Syrjälä
On Sat, Oct 03, 2015 at 02:55:57AM +0800, Chih-Wei Huang wrote:
> 2015-05-22 1:21 GMT+08:00 Ville Syrjälä <ville.syrj...@linux.intel.com>:
> > On Sat, May 16, 2015 at 01:36:44AM +0800, Chih-Wei Huang wrote:
> >> 2015-04-13 19:00 GMT+08:00 Ville Syrjälä <ville.syrj...@linux.intel.com>:
> >>
> >> Glad to see the patches to improve i915 driver.
> >> I thought Intel has given it up.
> >> Thank you!
> >>
> >> I'm happy to test them on my Atom N450 tablet
> >> once they are merged.
> >
> > Available at
> > git://github.com/vsyrjala/mesa.git i915_fixes_7
> > if you can't wait :P
> 
> Hello, how about these i915 patches?
> Seems they are not merged.
> Are they still needed?
> 
> I've tried to apply to mesa 11.0.2
> but there are conflicts I can't resolve.

I've not managed to push them. I was planning to push them but then my summer
vacation was coming up, and I figured it's a bit rude to push and disappear
in case there are any issues. After coming back I naturally forgot all about
it, and when I did eventually rememer 11.0 was just about to be released,
so I thought it's better to push after the release, and then I've just been
too busy with other crap to do anything with these.

But I think Ian recently picked up at least some of them (thanks for that
BTW). I've not found the time to really good look at what patches he had
taken and which were still missing. Ian?

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 1/8] i965: Add a helper function intel_get_tile_dims()

2015-09-10 Thread Ville Syrjälä
On Thu, Sep 10, 2015 at 12:20:10PM -0700, Chad Versace wrote:
> On Wed 19 Aug 2015, Anuj Phogat wrote:
> > V2:
> > - Do the tile width/height computations in the new helper
> >   function and use it later in intel_miptree_get_tile_masks().
> > - Change the name to intel_get_tile_dims().
> > 
> > Cc: Ben Widawsky <b...@bwidawsk.net>
> > Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
> > ---
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 81 
> > +++
> >  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  4 ++
> >  2 files changed, 63 insertions(+), 22 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > index e85c3f0..c282e94 100644
> > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> > @@ -563,35 +563,15 @@ static unsigned long
> >  intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, unsigned *alignment,
> >  unsigned long *pitch)
> >  {
> > -   const uint32_t bpp = mt->cpp * 8;
> > -   const uint32_t aspect_ratio = (bpp == 16 || bpp == 64) ? 2 : 1;
> > uint32_t tile_width, tile_height;
> > unsigned long stride, size, aligned_y;
> >  
> > assert(mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE);
> > -
> > -   switch (bpp) {
> > -   case 8:
> > -  tile_height = 64;
> > -  break;
> > -   case 16:
> > -   case 32:
> > -  tile_height = 32;
> > -  break;
> > -   case 64:
> > -   case 128:
> > -  tile_height = 16;
> > -  break;
> > -   default:
> > -  unreachable("not reached");
> > -   }
> > -
> > -   if (mt->tr_mode == INTEL_MIPTREE_TRMODE_YS)
> > -  tile_height *= 4;
> > +   intel_get_tile_dims(mt->tiling, mt->tr_mode, mt->cpp,
> > +   _width, _height);
> >  
> > aligned_y = ALIGN(mt->total_height, tile_height);
> > stride = mt->total_width * mt->cpp;
> > -   tile_width = tile_height * mt->cpp * aspect_ratio;
> > stride = ALIGN(stride, tile_width);
> > size = stride * aligned_y;
> >  
> > @@ -1081,6 +1061,63 @@ intel_miptree_get_image_offset(const struct 
> > intel_mipmap_tree *mt,
> > *y = mt->level[level].slice[slice].y_offset;
> >  }
> >  
> > +
> > +/**
> > + * This function computes the width and height in bytes of different tiling
> > + * patterns. If the BO is untiled, the dimensions are set to cpp.
> > + */
> 
> Is the tile_w parameter in units of bytes or pixels? That should be
> documented at the top of the function.
> 
> Also, just to be clear, "tile height" is always unitless. The hw docs
> sometime express it in units of "rows". But "rows" itself is unitless.
> 
> > +void
> > +intel_get_tile_dims(uint32_t tiling, uint32_t tr_mode, uint32_t cpp,
> > +uint32_t *tile_w, uint32_t *tile_h)
> > +{
> > +   if (tr_mode == INTEL_MIPTREE_TRMODE_NONE) {
> > +  switch (tiling) {
> > +  case I915_TILING_X:
> > + *tile_w = 512;
> > + *tile_h = 8 * cpp;
> 
> For legacy tiling formats, the height of a tile is independent of the
> pixel size,  because the height is unitless. For Tile X, it's always
> 2^3. For Tile Y Legacy, it's always 2^5.
> 
> If tile_w is in units of bytes, then it's also independent of pixel
> size. If tile_w is in units of pixels, though, then
> 
> tile_w_pixels = tile_w_bytes / cpp
> 
> 
> > + break;
> > +  case I915_TILING_Y:
> > + *tile_w = 128;
> > + *tile_h = 32 * cpp;
> > + break;
> > +  case I915_TILING_NONE:
> > + *tile_w = cpp;
> > + *tile_h = cpp;
> > + break;
> > +  default:
> > + unreachable("not reached");
> > +  }
> > +   } else {
> > +  uint32_t aspect_ratio = 1;
> > +  assert(_mesa_is_pow_two(cpp));
> > +
> > +  switch (cpp) {
> > +  case 1:
> > + *tile_h = 64 * cpp;
> 
> I'm still reading the docs for the non-legay tiling formats Yf, and Ys.
> So I can't comment on this part of the patch.

As it turns out I was just looking at Yf and whatnot from display POV,
and I came to the conclusion that I'll change the kernel to just have a
function to return the tile width in bytes based on the cpp passed in,
and then I can simply compute tile height as 'tile_size / tile_width',
or tile size in pixels (if needed) as 'tile_width / cpp'

And what I understood about Yf (the docs are no good IME, at least the
part I was looking at) is the following:

cpp w_bytes w_pixels h  aspect
1   64  64   64 1
2   128 64   32 2
4   128 32   32 1
8   256 32   16 2
16  256 16   16 1

So all you really need to know is cpp and w_bytes and the rest can all
be computed as needed.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] i915: fixing driver crashes if too few vertices are submitted

2015-09-09 Thread Ville Syrjälä
On Wed, Sep 09, 2015 at 11:53:54AM -0700, Ian Romanick wrote:
> On 09/09/2015 11:16 AM, Marius Predut wrote:
> > Comparison with a signed expression and unsigned value
> > is converted to unsigned value, reason for minus value is interpreted
> > as a big unsigned value. For this case the "for" loop
> > is going into unexpected behavior.
> > 
> > v1:Brian Paul: code style fix.
> 
> I don't think you really did...
> 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38109
> > Signed-off-by: Marius Predut <marius.pre...@intel.com>
> > ---
> >  src/mesa/tnl_dd/t_dd_dmatmp.h | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/tnl_dd/t_dd_dmatmp.h b/src/mesa/tnl_dd/t_dd_dmatmp.h
> > index 7be3954..79de224 100644
> > --- a/src/mesa/tnl_dd/t_dd_dmatmp.h
> > +++ b/src/mesa/tnl_dd/t_dd_dmatmp.h
> > @@ -627,6 +627,9 @@ static void TAG(render_quads_verts)( struct gl_context 
> > *ctx,
> >LOCAL_VARS;
> >GLuint j;
> >  
> > +  if(count % 4 != 0)
>^
> ...because I'm quite sure Brian's code had a space here, per Mesa's
> coding standards.
> 
> Also, this change is incorrect.  If an application does
> glDrawArrays(GL_QUADS, 0, (n * 4) + 3), this change will cause zero
> quads to be drawn when n quads should be drawn.

Due to the other check added to validate_render() we would never even
get here since we would have dropped off the fast path entrirely. Which
is not a nice thing to do simply due to an extra vertex hanging around
somewhere.

> 
> Page 18 (page 32 of the PDF) of the OpenGL 2.1 spec says:
> 
> "The total number of vertices between Begin and End is 4n + k,
> where 0 ≤ k ≤ 3; if k is not zero, the final k vertices are
> ignored."
> 
> We probably don't have a piglit test for this scenario, so one should be
> added.  You can CC me on that patch. :)
> 
> I think the correct change is to trim count such that (count % 4) == 0.
>  If the modified value of count is zero, bail out.  With that change,
> the other hunk (below) is unnecessary.

I would suggest having one function that gets passed the primitive and
vertex count and have it return the trimmed vertex count. That could
then be called from both intel_run_render() and and TAG(validate_render())
to skip any extra vertices. That way none of the actual render functions
in t_dd_dmatmp.h need worry about any extra vertices.

> 
> > +  return;
> > +
> >INIT(GL_TRIANGLES);
> >  
> >for (j = start; j < count-3; j += 4) {
> > @@ -1248,7 +1251,7 @@ static GLboolean TAG(validate_render)( struct 
> > gl_context *ctx,
> > ok = (GLint) count < GET_SUBSEQUENT_VB_MAX_ELTS();
> >  }
> >  else {
> > -   ok = HAVE_TRIANGLES; /* flatshading is ok. */
> > +   ok = HAVE_TRIANGLES && (count % 4 == 0); /* flatshading is ok. */
> >  }
> >  break;
> >default:
> > 
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH resend 7/7] i965: Disable HW-binding tables on batch finish for Broadwell

2015-06-03 Thread Ville Syrjälä
On Wed, Jun 03, 2015 at 10:05:25AM +0300, Abdiel Janulgue wrote:
 
 On 06/02/2015 08:28 PM, Kenneth Graunke wrote:
  On Tuesday, June 02, 2015 03:23:35 PM Abdiel Janulgue wrote:
 
  On 06/02/2015 09:31 AM, Kenneth Graunke wrote:
  On Monday, June 01, 2015 03:14:30 PM Abdiel Janulgue wrote:
  This is needed since kernel doesn't support RS context save and
  restore on BDW yet. So manually disable hw-generated binding tables
  when done using it in the batch. Otherwise the GPU would no longer
  accept software binding tables submitted by other clients including
  but not limited to the Xorg driver.
 
  Signed-off-by: Abdiel Janulgue abdiel.janul...@linux.intel.com
  ---
   src/mesa/drivers/dri/i965/intel_batchbuffer.c | 11 +++
   src/mesa/drivers/dri/i965/intel_batchbuffer.h |  3 ++-
   2 files changed, 13 insertions(+), 1 deletion(-)
 
  This seems fairly awful.  The kernel should prevent userspace from
  breaking other userspace...in the hardware context world, this really
  doesn't feel like our job.
 
  Why didn't you just update your kernel patch for Broadwell?  i.e. make
 
  drm/i915: Enable Resource Streamer state save/restore in HSW
 
  do:
  +   if (IS_HASWELL(ring-dev) || INTEL_INFO(ring-dev)-gen = 8)
 
  instead of:
  +   if (IS_HASWELL(ring-dev))
 
  It looks like the MI_SET_CONTEXT RS save/restore bits you used on
  Haswell still exist on Broadwell.  Do they not work or something?
 
 
 
  I was hoping to have a follow-up for GEN8 as well once the initial
  kernel patches get merged  :)
  
  We should do it all at once - it should be trivial to support both.
  
 
 I put in the necessary MI_SET_CONTEXT bits for GEN8 and tried it out
 this morning. Unfortunately, it doesn't work and hung my system.
 
 I guess the reason for this is that MI_SET_CONTEXT is not used in GEN8
 anymore? I found these comments and quoting from the docs in intel_lrc.c:
 
 Instead of the legacy MI_SET_CONTEXT, the way you tell the GPU to
 switch to a contexts is via a context execution list, ergo Execlists.
 
 Unfortunately, I'm not that familiar with the execlist implementation in
 the kernel. I have a hunch that I have to insert somewhere in the global
 default context a state that says hw-generated binding tables are
 disabled but I'm not quite sure where to put it, probably need to study
 it a bit.

I suppose you need to at least set the RS context enable bit in the
RING_CONTEXT_CONTROL register (in populate_lr_context()).

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/20] i915: Enable intel_render path for points

2015-05-22 Thread Ville Syrjälä
On Thu, May 21, 2015 at 09:14:03PM +0300, Ville Syrjälä wrote:
 On Fri, May 15, 2015 at 12:18:11PM -0700, Ian Romanick wrote:
  There are some really twitchy tests in ES1 (and possibly ES2)
  conformance related to this.  Do any of those tests change with this commit?
 
 I did run some ES1 conformnce tests, but the branches in the repo
 were not very clear so I'm not sure if I ran the right thing (looks
 like I used the gles1 branch and managed to build something that
 at least runs).
 
 No changes in the results on 855 or PNV between my branch and the
 baseline AFAICS.
 
 I'll see if I can get the ES2 tests built as well, and run them on
 PNV.

OK, I managed to run the ES2 tests on PNV, and there are no changes
in the results.

 
  
  On 03/23/2015 05:47 AM, ville.syrj...@linux.intel.com wrote:
   From: Ville Syrjälä ville.syrj...@linux.intel.com
   
   The sub-pixel adjustment for points was killed off in
commit 60d762aa625095a8c1f9597d8530bb5a6fa61b4c
Author: Xiang, Haihao haihao.xi...@intel.com
Date:   Wed Jan 2 11:38:51 2008 +0800
   
   i915: Needn't adjust pixel centers. fix #12944
   
   so if we don't need it in intel_tris.c we don't need it in
   intel_render.c either, which means we can allow
   intel_render.c to render points.
   
   Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
   ---
src/mesa/drivers/dri/i915/intel_render.c | 8 +++-
1 file changed, 3 insertions(+), 5 deletions(-)
   
   diff --git a/src/mesa/drivers/dri/i915/intel_render.c 
   b/src/mesa/drivers/dri/i915/intel_render.c
   index 65ecd05..ef1c718 100644
   --- a/src/mesa/drivers/dri/i915/intel_render.c
   +++ b/src/mesa/drivers/dri/i915/intel_render.c
   @@ -54,9 +54,7 @@
 * dma buffers.  Use strip/fan hardware primitives where possible.
 * Try to simulate missing primitives with indexed vertices.
 */
   -#define HAVE_POINTS  0  /* Has it, but can't use because 
   subpixel has to
   - * be adjusted for points on the 
   INTEL/I845G
   - */
   +#define HAVE_POINTS  1
#define HAVE_LINES   1
#define HAVE_LINE_STRIPS 1
#define HAVE_TRIANGLES   1
   @@ -70,7 +68,7 @@
#define HAVE_ELTS0

static const uint32_t hw_prim[GL_POLYGON + 1] = {
   -   [GL_POINTS] = 0,
   +   [GL_POINTS] = PRIM3D_POINTLIST,
   [GL_LINES ] = PRIM3D_LINELIST,
   [GL_LINE_LOOP] = PRIM3D_LINESTRIP,
   [GL_LINE_STRIP] = PRIM3D_LINESTRIP,
   @@ -96,7 +94,7 @@ static const GLenum reduced_prim[GL_POLYGON + 1] = {
};

static const int scale_prim[GL_POLYGON + 1] = {
   -   [GL_POINTS] = 0, /* fallback case */
   +   [GL_POINTS] = 1,
   [GL_LINES] = 1,
   [GL_LINE_LOOP] = 2,
   [GL_LINE_STRIP] = 2,
   
 
 -- 
 Ville Syrjälä
 Intel OTC

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/20] t_dd_dmatmp: Kill the paths rendering quads/quad strips via indexed vertices

2015-05-22 Thread Ville Syrjälä
On Fri, May 15, 2015 at 12:06:54PM -0700, Ian Romanick wrote:
 On 03/23/2015 05:47 AM, ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
  
  No driver supports elts currently, and these make the validate_render
  code a bit hard to follow. Just kill them.
 
 It looks like both r200_tcl.c and radeon_tcl.c define HAVE_TRI_STRIPS
 and HAVE_ELTS, but I guess they're using t_dd_dmatmp2.h.  If you add a
 
 #if defined HAVE_ELTS
 #error HAVE_ELTS support is removed
 #endif
 
 do radeon and r200 still build?

Yes, they only emit elts via t_dd_dmatmp2.h as you said.

And do note that I didn't remove elt support entirely, I just killed
off the more tricky code that emits elts from the vert paths. So the
code can still handle elts just fine if VB-Elts is present.

 
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
   src/mesa/tnl_dd/t_dd_dmatmp.h | 133 
  ++
   1 file changed, 5 insertions(+), 128 deletions(-)
  
  diff --git a/src/mesa/tnl_dd/t_dd_dmatmp.h b/src/mesa/tnl_dd/t_dd_dmatmp.h
  index 52ea2bf..5ea2d31 100644
  --- a/src/mesa/tnl_dd/t_dd_dmatmp.h
  +++ b/src/mesa/tnl_dd/t_dd_dmatmp.h
  @@ -36,7 +36,7 @@
* Produces code for both inline triangles and indexed triangles.
* Where various primitive types are unaccelerated by hardware, the
* code attempts to fallback to other primitive types (quadstrips to
  - * tristrips, lineloops to linestrips), or to indexed vertices.
  + * tristrips, lineloops to linestrips).
*/
   
   #if !defined(HAVE_TRIANGLES)
  @@ -444,65 +444,6 @@ static void TAG(render_quad_strip_verts)( struct 
  gl_context *ctx,
 }
   
 FLUSH();
  -
  -   } else if (HAVE_TRI_STRIPS  
  - ctx-Light.ShadeModel == GL_FLAT 
  - TNL_CONTEXT(ctx)-vb.AttribPtr[_TNL_ATTRIB_COLOR0]-stride) {
 
 I don't think removing this whole block is right because !HAVE_ELTS is
 an error case... but I guess we've never hit that error case since
 HAVE_ELTS is always zero.

There are no real error cases here, just dead code. validate_render() is
supposed to make sure we never call these functions if the code can't
actually render the primitives. The fprintf()+return branches should
really just contain assert(0) or equivalent.

 
  -  if (HAVE_ELTS) {
  -LOCAL_VARS;
  -int dmasz = GET_SUBSEQUENT_VB_MAX_ELTS();
  -int currentsz;
  -GLuint j, nr;
  -
  - EMIT_INDEXED_VERTS( ctx, start, count );
  -
  -/* Simulate flat-shaded quadstrips using indexed vertices:
  - */
  -ELT_INIT( GL_TRIANGLES );
  -
  -currentsz = GET_CURRENT_VB_MAX_ELTS();
  -
  -/* Emit whole number of quads in total, and in each buffer.
  - */
  -dmasz -= dmasz  1;
  -count -= (count-start)  1;
  -currentsz -= currentsz  1;
  -
  -if (currentsz  12)
  -   currentsz = dmasz;
  -
  -currentsz = currentsz/6*2;
  -dmasz = dmasz/6*2;
  -
  -for (j = start; j + 3  count; j += nr - 2 ) {
  -   nr = MIN2( currentsz, count - j );
  -   if (nr = 4) {
  -  GLint quads = (nr/2)-1;
  -  GLint i;
  -  ELTS_VARS( ALLOC_ELTS( quads*6 ) );
  -
  -  for ( i = j-start ; i  j-start+quads*2 ; i+=2 ) {
  - EMIT_TWO_ELTS( 0, (i+0), (i+1) );
  - EMIT_TWO_ELTS( 2, (i+2), (i+1) );
  - EMIT_TWO_ELTS( 4, (i+3), (i+2) );
  - INCR_ELTS( 6 );
  -  }
  -
  -  FLUSH();
  -   }
  -   currentsz = dmasz;
  -}
  -
  -RELEASE_ELT_VERTS();
  -FLUSH();
  -  }
  -  else {
  -/* Vertices won't fit in a single buffer or elts not
  - * available - should never happen.
  - */
  -fprintf(stderr, %s - cannot draw primitive\n, __FUNCTION__);
  -return;
  -  }
  }
  else if (HAVE_TRI_STRIPS) {
 LOCAL_VARS;
  @@ -568,57 +509,6 @@ static void TAG(render_quads_verts)( struct gl_context 
  *ctx,
currentsz = dmasz;
 }
  }
  -   else if (HAVE_ELTS) {
  -  /* Hardware doesn't have a quad primitive type -- try to
  -   * simulate it using indexed vertices and the triangle
  -   * primitive:
  -   */
  -  LOCAL_VARS;
  -  int dmasz = GET_SUBSEQUENT_VB_MAX_ELTS();
  -  int currentsz;
  -  GLuint j, nr;
  -
  -  EMIT_INDEXED_VERTS( ctx, start, count );
  -
  -  FLUSH();
  -  ELT_INIT( GL_TRIANGLES );
  -  currentsz = GET_CURRENT_VB_MAX_ELTS();
  -
  -  /* Emit whole number of quads in total, and in each buffer.
  -   */
  -  dmasz -= dmasz  3;
  -  count -= (count-start)  3;
  -  currentsz -= currentsz  3;
  -
  -  /* Adjust for rendering as triangles:
  -   */
  -  currentsz = currentsz/6*4;
  -  dmasz = dmasz/6*4;
  -
  -  if (currentsz  8)
  -currentsz = dmasz;
  -
  -  for (j = start; j  count; j += nr ) {
  -nr = MIN2( currentsz, count - j );
  -if (nr = 4) {
  -   GLint quads = nr/4

Re: [Mesa-dev] [PATCH 15/20] i915: Enable intel_render path for points

2015-05-21 Thread Ville Syrjälä
On Fri, May 15, 2015 at 12:18:11PM -0700, Ian Romanick wrote:
 There are some really twitchy tests in ES1 (and possibly ES2)
 conformance related to this.  Do any of those tests change with this commit?

I did run some ES1 conformnce tests, but the branches in the repo
were not very clear so I'm not sure if I ran the right thing (looks
like I used the gles1 branch and managed to build something that
at least runs).

No changes in the results on 855 or PNV between my branch and the
baseline AFAICS.

I'll see if I can get the ES2 tests built as well, and run them on
PNV.

 
 On 03/23/2015 05:47 AM, ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
  
  The sub-pixel adjustment for points was killed off in
   commit 60d762aa625095a8c1f9597d8530bb5a6fa61b4c
   Author: Xiang, Haihao haihao.xi...@intel.com
   Date:   Wed Jan 2 11:38:51 2008 +0800
  
  i915: Needn't adjust pixel centers. fix #12944
  
  so if we don't need it in intel_tris.c we don't need it in
  intel_render.c either, which means we can allow
  intel_render.c to render points.
  
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
   src/mesa/drivers/dri/i915/intel_render.c | 8 +++-
   1 file changed, 3 insertions(+), 5 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i915/intel_render.c 
  b/src/mesa/drivers/dri/i915/intel_render.c
  index 65ecd05..ef1c718 100644
  --- a/src/mesa/drivers/dri/i915/intel_render.c
  +++ b/src/mesa/drivers/dri/i915/intel_render.c
  @@ -54,9 +54,7 @@
* dma buffers.  Use strip/fan hardware primitives where possible.
* Try to simulate missing primitives with indexed vertices.
*/
  -#define HAVE_POINTS  0  /* Has it, but can't use because subpixel 
  has to
  - * be adjusted for points on the 
  INTEL/I845G
  - */
  +#define HAVE_POINTS  1
   #define HAVE_LINES   1
   #define HAVE_LINE_STRIPS 1
   #define HAVE_TRIANGLES   1
  @@ -70,7 +68,7 @@
   #define HAVE_ELTS0
   
   static const uint32_t hw_prim[GL_POLYGON + 1] = {
  -   [GL_POINTS] = 0,
  +   [GL_POINTS] = PRIM3D_POINTLIST,
  [GL_LINES ] = PRIM3D_LINELIST,
  [GL_LINE_LOOP] = PRIM3D_LINESTRIP,
  [GL_LINE_STRIP] = PRIM3D_LINESTRIP,
  @@ -96,7 +94,7 @@ static const GLenum reduced_prim[GL_POLYGON + 1] = {
   };
   
   static const int scale_prim[GL_POLYGON + 1] = {
  -   [GL_POINTS] = 0, /* fallback case */
  +   [GL_POINTS] = 1,
  [GL_LINES] = 1,
  [GL_LINE_LOOP] = 2,
  [GL_LINE_STRIP] = 2,
  

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/20] t_dd_dmatmp: Allow flat shaded polygons with tri fans

2015-05-21 Thread Ville Syrjälä
On Fri, May 15, 2015 at 12:07:54PM -0700, Ian Romanick wrote:
 On 03/23/2015 05:47 AM, ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
  
  We can allow rendering flat shaded polygons using tri fans if we check
  the provoking vertex convention.
 
 This sounds reasonable since it matches the DX behavior.  Is there a
 piglit test that would hit this?

Dunno. And in fact we won't hit it with i915 sine we HAVE_POLYGONS.
Failed to realize that before I cooked up the patch :)

 
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
   src/mesa/tnl_dd/t_dd_dmatmp.h | 12 +---
   1 file changed, 9 insertions(+), 3 deletions(-)
  
  diff --git a/src/mesa/tnl_dd/t_dd_dmatmp.h b/src/mesa/tnl_dd/t_dd_dmatmp.h
  index 5ea2d31..3ed4a98 100644
  --- a/src/mesa/tnl_dd/t_dd_dmatmp.h
  +++ b/src/mesa/tnl_dd/t_dd_dmatmp.h
  @@ -406,7 +406,9 @@ static void TAG(render_poly_verts)( struct gl_context 
  *ctx,
   
 FLUSH();
  }
  -   else if (HAVE_TRI_FANS  ctx-Light.ShadeModel == GL_SMOOTH) {
  +   else if (HAVE_TRI_FANS 
  +   (ctx-Light.ShadeModel == GL_SMOOTH ||
  +ctx-Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION_EXT)) {
 TAG(render_tri_fan_verts)( ctx, start, count, flags );
  } else {
 fprintf(stderr, %s - cannot draw primitive\n, __FUNCTION__);
  @@ -885,7 +887,9 @@ static void TAG(render_poly_elts)( struct gl_context 
  *ctx,
   FLUSH();
   currentsz = dmasz;
 }
  -   } else if (HAVE_TRI_FANS  ctx-Light.ShadeModel == GL_SMOOTH) {
  +   } else if (HAVE_TRI_FANS 
  + (ctx-Light.ShadeModel == GL_SMOOTH ||
  +  ctx-Light.ProvokingVertex == GL_FIRST_VERTEX_CONVENTION_EXT)) {
 TAG(render_tri_fan_verts)( ctx, start, count, flags );
  } else {
 fprintf(stderr, %s - cannot draw primitive\n, __FUNCTION__);
  @@ -1109,7 +1113,9 @@ static GLboolean TAG(validate_render)( struct 
  gl_context *ctx,
  ok = GL_TRUE;
   }
   else {
  -   ok = (HAVE_TRI_FANS  ctx-Light.ShadeModel == GL_SMOOTH);
  +   ok = (HAVE_TRI_FANS 
  + (ctx-Light.ShadeModel == GL_SMOOTH ||
  +  ctx-Light.ProvokingVertex == 
  GL_FIRST_VERTEX_CONVENTION_EXT));
}
   break;
 case GL_QUAD_STRIP:
  

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/20] t_dd_dmatmp: Disallow flat shading when rendering quad strips via tri strips

2015-05-21 Thread Ville Syrjälä
On Fri, May 15, 2015 at 12:08:33PM -0700, Ian Romanick wrote:
 On 03/23/2015 05:47 AM, ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
  
  When rendering quad strips via tri strips we can't get the provoking
  vertex right, so disallow flat shading.
 
 Same comments as for patch 2.

I'm not sure about piglit, but there's bunch of stuff in mesa-demos
that hit this stuff.

To clarify a bit, this series does fix the piglit provoking vertex
test on 855, but I'm unsure which patches precisely are the key to
that since a lot of them try to fix various provoking vertex issues.
I was too lazy to reverse bisect my own patches.

 
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
   src/mesa/tnl_dd/t_dd_dmatmp.h | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)
  
  diff --git a/src/mesa/tnl_dd/t_dd_dmatmp.h b/src/mesa/tnl_dd/t_dd_dmatmp.h
  index 3ed4a98..f56b0aa 100644
  --- a/src/mesa/tnl_dd/t_dd_dmatmp.h
  +++ b/src/mesa/tnl_dd/t_dd_dmatmp.h
  @@ -447,7 +447,7 @@ static void TAG(render_quad_strip_verts)( struct 
  gl_context *ctx,
   
 FLUSH();
  }
  -   else if (HAVE_TRI_STRIPS) {
  +   else if (HAVE_TRI_STRIPS  ctx-Light.ShadeModel == GL_SMOOTH) {
 LOCAL_VARS;
 int dmasz = GET_SUBSEQUENT_VB_MAX_VERTS();
 int currentsz;
  @@ -1124,7 +1124,7 @@ static GLboolean TAG(validate_render)( struct 
  gl_context *ctx,
   } else if (HAVE_QUAD_STRIPS) {
  ok = GL_TRUE;
   } else {
  -   ok = HAVE_TRI_STRIPS;
  +   ok = (HAVE_TRI_STRIPS  ctx-Light.ShadeModel == GL_SMOOTH);
   }
   break;
 case GL_QUADS:
  

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Add marketing names for CHV

2015-04-16 Thread Ville Syrjälä
On Thu, Apr 16, 2015 at 11:20:01AM -0700, Kenneth Graunke wrote:
 On Thursday, April 16, 2015 09:00:46 PM ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
  
  All CHV devices will be branded as Intel(r) HD Graphics.
  
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
   include/pci_ids/i965_pci_ids.h | 8 
   1 file changed, 4 insertions(+), 4 deletions(-)
  
  diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
  index 3e3e8fe..8d757aa 100644
  --- a/include/pci_ids/i965_pci_ids.h
  +++ b/include/pci_ids/i965_pci_ids.h
  @@ -124,7 +124,7 @@ CHIPSET(0x1921, skl_gt2, Intel(R) Skylake ULT GT2F)
   CHIPSET(0x1926, skl_gt3, Intel(R) Skylake ULT GT3)
   CHIPSET(0x192A, skl_gt3, Intel(R) Skylake SRV GT3)
   CHIPSET(0x192B, skl_gt3, Intel(R) Skylake Halo GT3)
  -CHIPSET(0x22B0, chv, Intel(R) Cherryview)
  -CHIPSET(0x22B1, chv, Intel(R) Cherryview)
  -CHIPSET(0x22B2, chv, Intel(R) Cherryview)
  -CHIPSET(0x22B3, chv, Intel(R) Cherryview)
  +CHIPSET(0x22B0, chv, Intel(R) HD Graphics (Cherryview))
  +CHIPSET(0x22B1, chv, Intel(R) HD Graphics (Cherryview))
  +CHIPSET(0x22B2, chv, Intel(R) HD Graphics (Cherryview))
  +CHIPSET(0x22B3, chv, Intel(R) HD Graphics (Cherryview))
  
 
 Thank you for leaving Cherryview here - I like to have that there so
 it's unique and identifiable.

Yep, that was my thinking as well. Otherwise we'd have just a bazillion
HD Graphics entries and everyone would be even more confused.

 
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Thanks.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/20] i915: Pile of fixes and cleanups

2015-04-13 Thread Ville Syrjälä
On Mon, Mar 23, 2015 at 02:47:16PM +0200, ville.syrj...@linux.intel.com wrote:
 From: Ville Syrjälä ville.syrj...@linux.intel.com
 
 I've had some of these i915 patches lying around for half a year or more,
 so I figured it's time to post them.
 
 This series fixes rendering problems in glxgears and supertuxkart. It
 also fixes a few piglit tests (provoking vertex, and a few crashers).
 No piglit regressions on 855.
 
 Summary of the changes:
  * provoking vertex fixes
  * gen2 user fbo culling fix
  * some buffer handling fixes ported over from i965
  * gen3 fragment shader texcoord vs. varying fix (already posted before)
  * random point/line rendering stuff
  * a bit of polish here and there

Doesn't look like many people are interested in reading through these.

Let me put it the other way: Anyone opposed to me just pushing these?

 
 Ville Syrjälä (20):
   t_dd_dmatmp: Kill the paths rendering quads/quad strips via indexed
 vertices
   t_dd_dmatmp: Allow flat shaded polygons with tri fans
   t_dd_dmatmp: Disallow flat shading when rendering quad strips via tri
 strips
   t_dd_dmatmp: Check provoking vertex convention when rendering quads
   t_dd_dmatmp: Call render_tri_fan_elts from render_poly_elts
   t_dd_dmatmp: Fix render_quad_strip_elts
   t_dd_dmatmp: Make the render_tab[]s const
   i915: Fix collision between I830_UPLOAD_RASTER_RULES and
 I830_UPLOAD_TEX(0)
   i915: Handle provoking vertex in intelFastRenderClippedPoly()
   i915: Fix t_vb_rendertmp.h's provoking vertex handywork
   i915: Use _tnl_RenderClippedPolygon and _tnl_RenderClippedLine
   i915: Make hw_prim[] const
   i915: Use c99 initializers for primitive arrays
   i915: Use COPY_DWORDS for points
   i915: Enable intel_render path for points
   i915: Adjust line size limits
   i915: Remember to call intel_prepare_render() before blitting
   i915: Drop broken front_buffer_reading/drawing optimization
   i915: Fix culling with user fbos on gen2
   i915: Fix texcoord vs. varying collision in fragment programs
 
  src/mesa/drivers/dri/i915/i830_context.h  |   8 +-
  src/mesa/drivers/dri/i915/i830_state.c|   2 +
  src/mesa/drivers/dri/i915/i915_context.h  |  14 +--
  src/mesa/drivers/dri/i915/i915_fragprog.c |  86 ++---
  src/mesa/drivers/dri/i915/intel_buffers.c |  47 ---
  src/mesa/drivers/dri/i915/intel_buffers.h |   3 +
  src/mesa/drivers/dri/i915/intel_context.c |  16 +--
  src/mesa/drivers/dri/i915/intel_context.h |  16 ---
  src/mesa/drivers/dri/i915/intel_fbo.c |   5 +
  src/mesa/drivers/dri/i915/intel_render.c  |  66 +-
  src/mesa/drivers/dri/i915/intel_tris.c| 114 ++---
  src/mesa/tnl_dd/t_dd_dmatmp.h | 196 
 +++---
  12 files changed, 266 insertions(+), 307 deletions(-)
 
 -- 
 2.0.5

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Fix URB size for CHV

2015-03-06 Thread Ville Syrjälä
On Thu, Mar 05, 2015 at 01:48:29PM -0800, Kenneth Graunke wrote:
 On Thursday, March 05, 2015 07:41:29 PM Ville Syrjälä wrote:
  On Fri, Jan 23, 2015 at 12:12:56PM +0200, ville.syrj...@linux.intel.com 
  wrote:
   From: Ville Syrjälä ville.syrj...@linux.intel.com
   
   Increase the device info .urb.size for CHV to match the default URB
   size (192kB).
   
   Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  
  Ping?
 
 Oh, sorry!  I thought I'd reviewed this.  It does indeed appear to be
 192kB.
 
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 
 Have you tested it?  Assuming it doesn't explode, feel free to push
 this.  Thanks for catching the mistake!

Yep, been running with this patch for ~6 months or so ;)

Pushed now. Thanks.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Fix build errors on x86+sse

2015-03-05 Thread Ville Syrjälä
On Thu, Mar 05, 2015 at 11:40:16AM -0800, Matt Turner wrote:
 On Thu, Mar 5, 2015 at 11:32 AM,  ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
 
 I committed a patch a little while ago from Mark to fix this (as
 commit 5f9ee6a0).

Yep see it now. I must have fetched more or less just before you
pushed it. But problem solved in any case, which is what matters.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Fix URB size for CHV

2015-03-05 Thread Ville Syrjälä
On Fri, Jan 23, 2015 at 12:12:56PM +0200, ville.syrj...@linux.intel.com wrote:
 From: Ville Syrjälä ville.syrj...@linux.intel.com
 
 Increase the device info .urb.size for CHV to match the default URB
 size (192kB).
 
 Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com

Ping?

 ---
  src/mesa/drivers/dri/i965/brw_device_info.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c 
 b/src/mesa/drivers/dri/i965/brw_device_info.c
 index 3c3c564..ba65584 100644
 --- a/src/mesa/drivers/dri/i965/brw_device_info.c
 +++ b/src/mesa/drivers/dri/i965/brw_device_info.c
 @@ -241,7 +241,7 @@ static const struct brw_device_info brw_device_info_chv = 
 {
 .max_gs_threads = 80,
 .max_wm_threads = 128,
 .urb = {
 -  .size = 128,
 +  .size = 192,
.min_vs_entries = 34,
.max_vs_entries = 640,
.max_gs_entries = 256,
 -- 
 2.0.5

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] i965: Fix URB size for gen8

2015-01-23 Thread Ville Syrjälä
On Wed, Jan 21, 2015 at 12:51:02PM -0800, Kenneth Graunke wrote:
 On Wednesday, January 21, 2015 08:17:36 PM ville.syrj...@linux.intel.com 
 wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
  
  Increase the device info .urb.size for BDW GT3 and CHV to match the
  default URB size for each.
  
  Also add all missing platforms (BYT,BDW,CHV) to the comment describing
  the default URB size in gen7_urb.c.
  
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
   src/mesa/drivers/dri/i965/brw_device_info.c | 4 ++--
   src/mesa/drivers/dri/i965/gen7_urb.c| 5 -
   2 files changed, 6 insertions(+), 3 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c 
  b/src/mesa/drivers/dri/i965/brw_device_info.c
  index bdef42b..d0b9e05 100644
  --- a/src/mesa/drivers/dri/i965/brw_device_info.c
  +++ b/src/mesa/drivers/dri/i965/brw_device_info.c
  @@ -226,7 +226,7 @@ static const struct brw_device_info 
  brw_device_info_bdw_gt3 = {
  GEN8_FEATURES, .gt = 3,
  .max_wm_threads = 384,
  .urb = {
  -  .size = 384,
  +  .size = 768,
 .min_vs_entries = 64,
 .max_vs_entries = 2560,
 .max_gs_entries = 960,
  @@ -243,7 +243,7 @@ static const struct brw_device_info brw_device_info_chv 
  = {
  .max_gs_threads = 80,
  .max_wm_threads = 128,
  .urb = {
  -  .size = 128,
  +  .size = 192,
 .min_vs_entries = 34,
 .max_vs_entries = 640,
 .max_gs_entries = 256,
  diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
  b/src/mesa/drivers/dri/i965/gen7_urb.c
  index 201f42e..f90d6e3 100644
  --- a/src/mesa/drivers/dri/i965/gen7_urb.c
  +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
  @@ -50,9 +50,12 @@
* Currently we split the constant buffer space evenly among whatever 
  stages
* are active.  This is probably not ideal, but simple.
*
  - * Ivybridge GT1 and Haswell GT1 have 128kB of URB space.
  + * Ivybridge GT1, Baytrail and Haswell GT1 have 128kB of URB space.
* Ivybridge GT2 and Haswell GT2 have 256kB of URB space.
* Haswell GT3 has 512kB of URB space.
  + * Broadwell GT1 and Cherryview have 192kB of URB space.
  + * Broadwell GT2 has 384kB of URB space.
  + * Broadwell GT3 has 768kB of URB space.
*
* See Volume 2a: 3D Pipeline, section 1.8, Volume 1b: Configurations,
* and the documentation for 3DSTATE_PUSH_CONSTANT_ALLOC_xS.
  
 
 Have you tested this?  I tried 768k on Broadwell GT3 a while back and got
 no end of GPU hangs.  Which is odd, because it should be the correct value.

I was just reading the spec a bit more and I saw this note in 3DSTATE_URB_VS:
The offset and size should be programmed as if there is only one slice
 enabled. Hardware will grow the size based on the slice configuration.
 The hardware supports up to 1024KB of URB space so any slice and max urb
 size configuration that goes over that limit is not allowed and will
 cause corruption. Refer to the L3 allocation and programming guide for
 valid URB configurations.

So I guess that explains it. And then 384 is the correct value for GT2
and GT3. I'm not quite sure how that interacts with the push constant
offset/size though...

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] i965: Fix URB size for gen8

2015-01-23 Thread Ville Syrjälä
On Wed, Jan 21, 2015 at 12:51:02PM -0800, Kenneth Graunke wrote:
 On Wednesday, January 21, 2015 08:17:36 PM ville.syrj...@linux.intel.com 
 wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
  
  Increase the device info .urb.size for BDW GT3 and CHV to match the
  default URB size for each.
  
  Also add all missing platforms (BYT,BDW,CHV) to the comment describing
  the default URB size in gen7_urb.c.
  
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
   src/mesa/drivers/dri/i965/brw_device_info.c | 4 ++--
   src/mesa/drivers/dri/i965/gen7_urb.c| 5 -
   2 files changed, 6 insertions(+), 3 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c 
  b/src/mesa/drivers/dri/i965/brw_device_info.c
  index bdef42b..d0b9e05 100644
  --- a/src/mesa/drivers/dri/i965/brw_device_info.c
  +++ b/src/mesa/drivers/dri/i965/brw_device_info.c
  @@ -226,7 +226,7 @@ static const struct brw_device_info 
  brw_device_info_bdw_gt3 = {
  GEN8_FEATURES, .gt = 3,
  .max_wm_threads = 384,
  .urb = {
  -  .size = 384,
  +  .size = 768,
 .min_vs_entries = 64,
 .max_vs_entries = 2560,
 .max_gs_entries = 960,
  @@ -243,7 +243,7 @@ static const struct brw_device_info brw_device_info_chv 
  = {
  .max_gs_threads = 80,
  .max_wm_threads = 128,
  .urb = {
  -  .size = 128,
  +  .size = 192,
 .min_vs_entries = 34,
 .max_vs_entries = 640,
 .max_gs_entries = 256,
  diff --git a/src/mesa/drivers/dri/i965/gen7_urb.c 
  b/src/mesa/drivers/dri/i965/gen7_urb.c
  index 201f42e..f90d6e3 100644
  --- a/src/mesa/drivers/dri/i965/gen7_urb.c
  +++ b/src/mesa/drivers/dri/i965/gen7_urb.c
  @@ -50,9 +50,12 @@
* Currently we split the constant buffer space evenly among whatever 
  stages
* are active.  This is probably not ideal, but simple.
*
  - * Ivybridge GT1 and Haswell GT1 have 128kB of URB space.
  + * Ivybridge GT1, Baytrail and Haswell GT1 have 128kB of URB space.
* Ivybridge GT2 and Haswell GT2 have 256kB of URB space.
* Haswell GT3 has 512kB of URB space.
  + * Broadwell GT1 and Cherryview have 192kB of URB space.
  + * Broadwell GT2 has 384kB of URB space.
  + * Broadwell GT3 has 768kB of URB space.
*
* See Volume 2a: 3D Pipeline, section 1.8, Volume 1b: Configurations,
* and the documentation for 3DSTATE_PUSH_CONSTANT_ALLOC_xS.
  
 
 Have you tested this?  I tried 768k on Broadwell GT3 a while back and got
 no end of GPU hangs.  Which is odd, because it should be the correct value.

OK, since the BDW stuff has potential issues, I've split the CHV stuff
into separate patches. I already pushed the max_wm_threads and
min_vs_entries patches with your r-b, and re-posted the URB size patch
with just CHV changed this time.

I'll leave the BDW bits to someone else who has enough BDW hardware
around to test and figure out what works and what doesn't.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Intel-gfx] [PATCH] drm/i915: Improve HiZ throughput on Cherryview.

2015-01-12 Thread Ville Syrjälä
On Sat, Jan 10, 2015 at 06:02:22PM -0800, Kenneth Graunke wrote:
 Found by reading the HIZ_CHICKEN documentation.
 
 Improves performance in a HiZ microbenchmark by around 50%.
 Improves performance in OglZBuffer by around 18%.
 
 Thanks to Chris Wilson for helping me figure out where to put this.
 
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  drivers/gpu/drm/i915/i915_reg.h | 3 +++
  drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +++
  2 files changed, 6 insertions(+)
 
 diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
 index 0f32fd1a..a39bb03 100644
 --- a/drivers/gpu/drm/i915/i915_reg.h
 +++ b/drivers/gpu/drm/i915/i915_reg.h
 @@ -5202,6 +5202,9 @@ enum punit_power_well {
  #define COMMON_SLICE_CHICKEN20x7014
  # define GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE(10)
  
 +#define HIZ_CHICKEN  0x7018
 +# define CHV_HZ_8X8_MODE_IN_1X   (115)
 +
  #define GEN7_L3SQCREG1   0xB010
  #define  VLV_B0_WA_L3SQCREG1_VALUE   0x00D3
  
 diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
 b/drivers/gpu/drm/i915/intel_ringbuffer.c
 index 12a36f0..dabc1d8 100644
 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
 +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
 @@ -836,6 +836,9 @@ static int chv_init_workarounds(struct intel_engine_cs 
 *ring)
 HDC_FORCE_NON_COHERENT |
 HDC_DONOT_FETCH_MEM_WHEN_MASKED);
  
 + /* Improve HiZ throughput on CHV. */
 + WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
 +

Nothing much in bspec about this bit. Can't see anything suspicious
in the w/a database either. So I guess we can assume it's safe.

Reviewed-by: Ville Syrjälä ville.syrj...@linux.intel.com

But I do wonder a bit if there's any relationship with the WIZ hashing
mode. Looks like we've not brought the 16x4 WIZ hashing mode change
over to CHV (or BYT for that matter), so I guess we're still using the
default 8x8 on these platforms. Might be interesting to see if there
are any gains to be had by changing it.

   return 0;
  }
  
 -- 
 2.2.1
 
 ___
 Intel-gfx mailing list
 intel-...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Intel-gfx] [PATCH] drm/i915: Enable the HiZ RAW Stall Optimization on Gen8.

2015-01-12 Thread Ville Syrjälä
 a VCS register. Sadly I've not found any documentation for !RCS
power context, but I'm assuming every engine has a power context of some sort.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Intel-gfx] [PATCH] drm/i915: Enable the HiZ RAW Stall Optimization on Gen8.

2015-01-12 Thread Ville Syrjälä
On Sat, Jan 10, 2015 at 06:44:49PM -0800, Kenneth Graunke wrote:
 This is an important optimization for avoiding read-after-write (RAW)
 stalls in the HiZ buffer.  Certain workloads would run very slowly with
 HiZ enabled, but run much faster with the hiz=false driconf option.
 With this patch, they run at full speed even with HiZ.
 
 Improves performance in OglVSInstancing by 3.2x on Broadwell GT3e
 (Iris Pro 6200).
 
 Thanks to Jesse Barnes for finding this missing bit!
 Thanks to Chris Wilson for helping me find where to set it.
 
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 Cc: Jesse Barnes jbar...@virtuousgeek.org
 ---
  drivers/gpu/drm/i915/intel_ringbuffer.c | 15 +++
  1 file changed, 15 insertions(+)
 
 Here's an alternate patch which implements the workaround in the kernel
 instead of Mesa.  It's probably better to do it there, since the kernel
 does it on Haswell already.
 
 diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
 b/drivers/gpu/drm/i915/intel_ringbuffer.c
 index dabc1d8..23020d6 100644
 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
 +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
 @@ -796,6 +796,16 @@ static int bdw_init_workarounds(struct intel_engine_cs 
 *ring)
 HDC_DONOT_FETCH_MEM_WHEN_MASKED |
 (IS_BDW_GT3(dev) ? HDC_FENCE_DEST_SLM_DISABLE : 0));
  
 + /* From the Haswell PRM, Command Reference: Registers, CACHE_MODE_0:
 +  * The Hierarchical Z RAW Stall Optimization allows non-overlapping
 +  *  polygons in the same 8x4 pixel/sample area to be processed without
 +  *  stalling waiting for the earlier ones to write to Hierarchical Z
 +  *  buffer.
 +  *
 +  * This optimization is off by default for Broadwell; turn it on.
 +  */
 + WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
 +
   /* Wa4x4STCOptimizationDisable:bdw */
   WA_SET_BIT_MASKED(CACHE_MODE_1,
 GEN8_4x4_STC_OPTIMIZATION_DISABLE);
 @@ -836,6 +846,11 @@ static int chv_init_workarounds(struct intel_engine_cs 
 *ring)
 HDC_FORCE_NON_COHERENT |
 HDC_DONOT_FETCH_MEM_WHEN_MASKED);
  
 + /* According to the CACHE_MODE_0 default value documentation, some
 +  * CHV platforms disable this optimization by default.  Turn it on.
 +  */
 + WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
 +

I remember looking at this when the HSW version was done, and at the
time the docs claimed that the default value on gen8 was already good.
Looks like the docs have been updated since then. Also my BSW agrees
that the disable bit is now set by default.

I suppose we could assume that since it works on HSW, it'll work on
gen8. However, I find it a bit suspicious that the later steppings seem
to have changed the default to disable the optimization. IVB suffered
from real problems with the optimization enabled and hence the IVB patch
was reverted. IIRC Chia-I wrote some kind of test to demonstrate the
problem on IVB, so maybe you want to run it and make sure it still
works correctly on gen8. Here's the test:
http://permalink.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/35399

   /* Improve HiZ throughput on CHV. */
   WA_SET_BIT_MASKED(HIZ_CHICKEN, CHV_HZ_8X8_MODE_IN_1X);
  
 -- 
 2.2.1
 
 ___
 Intel-gfx mailing list
 intel-...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/9] i915: Only use TEXCOORDTYPE_VECTOR with cube maps on gen2

2014-11-20 Thread Ville Syrjälä
On Thu, Nov 20, 2014 at 09:59:00AM -0800, Ian Romanick wrote:
 On 08/06/2014 11:56 AM, ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
  
  Check that the target is GL_TEXTURE_CUBE_MAP before emitting
  TEXCOORDTYPE_VECTOR texture coordinates.
  
  I'm not sure if the hardware would like CARTESIAN coordinates
  with cube maps, and as I'm too lazy to find out just emit the
  VECTOR coordinates for cube maps always. For other targets use
  CARTESIAN or HOMOGENOUS depending on the number of texture
  coordinates provided.
  
  Fixes rendering of the electric background texture in chromium-bsu
  main menu. We appear to be provided with three texture coordinates
  there (I'm guessing due to the funky texture matrix rotation it does).
  So the code would decide to use TEXCOORDTYPE_VECTOR instead of
  TEXCOORDTYPE_CARTESIAN even though we're dealing with a 2D texure.
  The results weren't what one might expect.
  
  demos/cubemap still works, which hopefully indicates that this doesn't
  break things.
 
 I won't dare ask about a full piglit run on that hardware,

I did actually try to run piglit on the 830, but it always
seemed to die in some X blit batch after a while :(

There have been some recent blt TLB workaround fixes in the
kernel though, so perhaps things are better now. And if not,
I do have a 855 that ought to be a bit less fragile. So
maybe I'll give it another go just for kicks :P

 but how about
 
 bin/glean -o -v -v -v -t +texCube --quick
 
 and
 
 bin/cubemap -auto
 
 from piglit?

pass-pass on both counts.

 
 These changes seem reasonable, and assuming those tests aren't made worse,
 
 Reviewed-by: Ian Romanick ian.d.roman...@intel.com
 
 If you're excited about GEN2 bugs, wanna take a look at 79841? :)

I'm thinking it should be fixed by:

commit dafae910d4fc791ba49f20e937cb918669f42944
Author: Ville Syrjälä ville.syrj...@linux.intel.com
Date:   Thu Jul 3 15:38:07 2014 +0300

i915: Accept GL_DEPTH_STENCIL GL_DEPTH_COMPONENT formats for
renderbuffers

I'll note it in the bug.

 
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
   src/mesa/drivers/dri/i915/i830_vtbl.c | 37 
  ++-
   1 file changed, 19 insertions(+), 18 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i915/i830_vtbl.c 
  b/src/mesa/drivers/dri/i915/i830_vtbl.c
  index 53d408b..0f22d86 100644
  --- a/src/mesa/drivers/dri/i915/i830_vtbl.c
  +++ b/src/mesa/drivers/dri/i915/i830_vtbl.c
  @@ -134,27 +134,28 @@ i830_render_start(struct intel_context *intel)
   GLuint mcs = (i830-state.Tex[i][I830_TEXREG_MCS] 
 ~TEXCOORDTYPE_MASK);
   
  -switch (sz) {
  -case 1:
  -case 2:
  -   emit = EMIT_2F;
  -   sz = 2;
  -   mcs |= TEXCOORDTYPE_CARTESIAN;
  -   break;
  -case 3:
  +if (intel-ctx.Texture.Unit[i]._Current-Target == 
  GL_TEXTURE_CUBE_MAP) {
  emit = EMIT_3F;
  sz = 3;
  mcs |= TEXCOORDTYPE_VECTOR;
  -   break;
  -case 4:
  -   emit = EMIT_3F_XYW;
  -   sz = 3;
  -   mcs |= TEXCOORDTYPE_HOMOGENEOUS;
  -   break;
  -default:
  -   continue;
  -};
  -
  +} else {
  +   switch (sz) {
  +   case 1:
  +   case 2:
  +   case 3:
  +  emit = EMIT_2F;
  +  sz = 2;
  +  mcs |= TEXCOORDTYPE_CARTESIAN;
  +  break;
  +   case 4:
  +  emit = EMIT_3F_XYW;
  +  sz = 3;
  +  mcs |= TEXCOORDTYPE_HOMOGENEOUS;
  +  break;
  +   default:
  +  continue;
  +   }
  +}
   
   EMIT_ATTR(_TNL_ATTRIB_TEX0 + i, emit, 0);
   v2 |= VRTX_TEX_SET_FMT(count, SZ_TO_HW(sz));
  

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/9] i915: Gen2 texturing fixes and a few random patches

2014-11-13 Thread Ville Syrjälä
On Wed, Aug 06, 2014 at 09:56:30PM +0300, ville.syrj...@linux.intel.com wrote:
 From: Ville Syrjälä ville.syrj...@linux.intel.com
 
 I had a few rainy days during my summer vacation so I decided to fix a
 chromnium-bsu texturing problem that was nagging me for a while now. I
 ended up fixing a few other things too that I spotted mostly from reading
 the code.
 
 The aniso vs. mip filter thing probably comes down to personal preference,
 but at least to me aniso+mip nearest looks better than trilinear. At least
 when playing the old classic glaxium :)
 
 I have no idea if the scissor patch makes any difference anywhere. I just
 caught the note in the spec and noticed we're doing it in the opposite order.
 
 The rest should be pretty clear.
 
 Ville Syrjälä (9):
   i915: Fix GL_DOT3_RGBA a bit
   i915: Use L8A8 instead of I8 to simulate A8 on gen2
   i915: Override mip filter to nearest with aniso
   i915: Accept GL_DEPTH_STENCIL GL_DEPTH_COMPONENT formats for
 renderbuffers
   i915: Kill intel_context::hw_stencil
   i915: Protect macro argument for TEXTURE_SET()
   i915: Don't call _mesa_meta_glsl_Clear() on gen2
   i915: Emit 3DSTATE_SCISSOR_RECTANGLE_0 before 3DSTATE_SCISSOR_ENABLE

I finally got around to pushing the reviewed patches from this series.
Thanks for the reviews.

   i915: Only use TEXCOORDTYPE_VECTOR with cube maps on gen2

This one is still lacking a review though, and it's actually for the
original bug I set out to fix. So I'd appreaciate if someone can take
a look at the patch.

There's also this gen3 specific patch I did that would like to get
reviewed: http://patchwork.freedesktop.org/patch/31661/

I also have to confess to having a decent pile of more vertex related
i915 patches sitting in a branch, one which actually makes glxgears
render correctly on gen2 ;) I'd like to post those too but I wanted
to get the old stuff out of the way first...

-- 
Ville Syrjälä
rntel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/10] i965: Add functions to convert float - VF.

2014-10-24 Thread Ville Syrjälä
 
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/10] i965: Add functions to convert float - VF.

2014-10-24 Thread Ville Syrjälä
On Thu, Oct 23, 2014 at 11:19:04PM -0700, Matt Turner wrote:
 On Thu, Oct 23, 2014 at 11:01 PM, Ville Syrjälä syrj...@sci.fi wrote:
  On Thu, Oct 23, 2014 at 04:44:03PM -0700, Matt Turner wrote:
  ---
   src/mesa/drivers/dri/i965/Makefile.sources   |  1 +
   src/mesa/drivers/dri/i965/brw_packed_float.c | 74 
  
   src/mesa/drivers/dri/i965/brw_packed_float.h | 25 ++
   3 files changed, 100 insertions(+)
   create mode 100644 src/mesa/drivers/dri/i965/brw_packed_float.c
   create mode 100644 src/mesa/drivers/dri/i965/brw_packed_float.h
 
  diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
  b/src/mesa/drivers/dri/i965/Makefile.sources
  index 9c006da..6b0f601 100644
  --- a/src/mesa/drivers/dri/i965/Makefile.sources
  +++ b/src/mesa/drivers/dri/i965/Makefile.sources
  @@ -78,6 +78,7 @@ i965_FILES = \
brw_meta_fast_clear.c \
brw_misc_state.c \
brw_object_purgeable.c \
  + brw_packed_float.c \
brw_performance_monitor.c \
brw_program.c \
brw_primitive_restart.c \
  diff --git a/src/mesa/drivers/dri/i965/brw_packed_float.c 
  b/src/mesa/drivers/dri/i965/brw_packed_float.c
  new file mode 100644
  index 000..d25e7dd
  --- /dev/null
  +++ b/src/mesa/drivers/dri/i965/brw_packed_float.c
  @@ -0,0 +1,74 @@
  +/*
  + * Copyright © 2014 Intel Corporation
  + *
  + * Permission is hereby granted, free of charge, to any person obtaining a
  + * copy of this software and associated documentation files (the 
  Software),
  + * to deal in the Software without restriction, including without 
  limitation
  + * the rights to use, copy, modify, merge, publish, distribute, 
  sublicense,
  + * and/or sell copies of the Software, and to permit persons to whom the
  + * Software is furnished to do so, subject to the following conditions:
  + *
  + * The above copyright notice and this permission notice (including the 
  next
  + * paragraph) shall be included in all copies or substantial portions of 
  the
  + * Software.
  + *
  + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, 
  EXPRESS OR
  + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
  MERCHANTABILITY,
  + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
  SHALL
  + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
  OTHER
  + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
  + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
  DEALINGS
  + * IN THE SOFTWARE.
  + */
  +#include brw_packed_float.h
  +
  +union fu {
  +   float f;
  +   unsigned u;
  +   struct {
  +  unsigned mantissa:23;
  +  unsigned exponent:8;
  +  unsigned sign:1;
  +   };
  +};
  +
  +int
  +brw_float_to_vf(float f)
  +{
  +   union fu fu = { .f = f };
  +
  +   /* ±0.0f is special cased. */
  +   if (f == 0.0f)
  +  return fu.sign  7;
  +
  +   unsigned mantissa = fu.mantissa  (23 - 4);
  +   unsigned exponent = fu.exponent - (127 - 3);
 
   0x7
 
 No, the exponent = 8 below handles that.

Oh I thought you were testing the original exponent, but you indeed
check the new one. Yeah that makes more sense.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 3/9] i915: Use L8A8 instead of I8 to simulate A8 on gen2

2014-08-15 Thread Ville Syrjälä
On Fri, Aug 15, 2014 at 10:52:50AM +0200, Erik Faye-Lund wrote:
 On Thu, Aug 7, 2014 at 10:31 AM,  ville.syrj...@linux.intel.com wrote:
  diff --git a/src/mesa/main/texformat.c b/src/mesa/main/texformat.c
  index c61a748..f414ea3 100644
  --- a/src/mesa/main/texformat.c
  +++ b/src/mesa/main/texformat.c
  case GL_ALPHA12:
  case GL_ALPHA16:
 RETURN_IF_SUPPORTED(MESA_FORMAT_A_UNORM16);
 RETURN_IF_SUPPORTED(MESA_FORMAT_A_UNORM8);
  +  RETURN_IF_SUPPORTED(MESA_FORMAT_L8A8_UNORM);
 break;
 
 I know this isn't exactly what your patch looked to support, but
 shouldn't MESA_FORMAT_L16A16_UNORM be tried as a lossless alternative
 also?

I suppose, but I suspect you'll have a hard time finding hardware that
supports L16A16 but not A16.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] kms_flip: Improve the accuracy of out frame time calculation

2014-08-11 Thread Ville Syrjälä
Sorry, ignore these. Shell history and fast fingers are a dangerous
combination...

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 8/9] i915: Don't call _mesa_meta_glsl_Clear() on gen2

2014-08-07 Thread Ville Syrjälä
On Wed, Aug 06, 2014 at 01:44:45PM -0700, Eric Anholt wrote:
 ville.syrj...@linux.intel.com writes:
 
  From: Ville Syrjälä ville.syrj...@linux.intel.com
 
  Gen2 doesn't have fragmnts shaders so we shouldn't be calling
 
 spelling^

Fixed locally.

 
 Other than that, patches 4-9 are:
 
 Reviewed-by: Eric Anholt e...@anholt.net

Thanks.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i915: Fix gen2 texblend setup

2014-06-17 Thread Ville Syrjälä
On Mon, Jun 16, 2014 at 12:22:43PM -0700, Ian Romanick wrote:
 Yeah, that looks right.
 
 Reviewed-by: Ian Romanick ian.d.roman...@intel.com
 
 Did you notice if other drivers have the same off-by-one issue?

I didn't spot similar problems elsewhere.

 
 On 06/16/2014 11:22 AM, ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
  
  Fix an off by one in the texture unit walk during texblend
  setup on gen2. This caused the last enabled texunit to be
  skipped resulting in totally messed up texturing.
  
  This is a regression introduced here:
   commit 1ad443ecdd694dd9bf3c4a5050d749fb80db6fa2
   Author: Eric Anholt e...@anholt.net
   Date:   Wed Apr 23 15:35:27 2014 -0700
  
  i915: Redo texture unit walking on i830.
  
  Cc: 10.2 mesa-sta...@lists.freedesktop.org
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
   src/mesa/drivers/dri/i915/i830_texblend.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  diff --git a/src/mesa/drivers/dri/i915/i830_texblend.c 
  b/src/mesa/drivers/dri/i915/i830_texblend.c
  index 6e991c4..236be59 100644
  --- a/src/mesa/drivers/dri/i915/i830_texblend.c
  +++ b/src/mesa/drivers/dri/i915/i830_texblend.c
  @@ -445,7 +445,7 @@ i830EmitTextureBlend(struct i830_context *i830)
  I830_ACTIVESTATE(i830, I830_UPLOAD_TEXBLEND_ALL, false);
   
  if (ctx-Texture._MaxEnabledTexImageUnit != -1) {
  -  for (unit = 0; unit  ctx-Texture._MaxEnabledTexImageUnit; unit++)
  +  for (unit = 0; unit = ctx-Texture._MaxEnabledTexImageUnit; unit++)
if (ctx-Texture.Unit[unit]._Current)
   emit_texblend(i830, unit, blendunit++,
 unit == ctx-Texture._MaxEnabledTexImageUnit);
  

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] swrast: Fix vertex color in _swsetup_Translate()

2014-04-23 Thread Ville Syrjälä
On Wed, Apr 23, 2014 at 08:40:22AM -0600, Brian Paul wrote:
 On 04/23/2014 08:18 AM, ville.syrj...@linux.intel.com wrote:
  From: nick n...@fuel7.com
 
  Straightforward fix to properly load dest-color with color data, as
  opposed to position data as previously implemented.
 
  [vsyrjala: I noticed the patch languishing in bugzilla.
It looks correct to me so I refreshed it for master.
 
Looks like it's a regression introduced in:
 commit a164d3aee063580503e5e9a77980059d52c486d7
 Author: Brian brian.p...@tungstengraphics.com
 Date:   Fri Jul 6 16:51:19 2007 -0600
 
   In _swsetup_Translate(), update dest-attrib[FRAG_ATTRIB_COL0].]
 
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27499
  Cc: Brian Paul bri...@vmware.com
  Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
  ---
src/mesa/swrast_setup/ss_context.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
 
  diff --git a/src/mesa/swrast_setup/ss_context.c 
  b/src/mesa/swrast_setup/ss_context.c
  index 12a4735..237f74c 100644
  --- a/src/mesa/swrast_setup/ss_context.c
  +++ b/src/mesa/swrast_setup/ss_context.c
  @@ -287,7 +287,8 @@ _swsetup_Translate( struct gl_context *ctx, const void 
  *vertex, SWvertex *dest )
 
   _tnl_get_attr( ctx, vertex, _TNL_ATTRIB_COLOR0,
  dest-attrib[VARYING_SLOT_COL0] );
  -   UNCLAMPED_FLOAT_TO_RGBA_CHAN( dest-color, tmp );
  +   UNCLAMPED_FLOAT_TO_RGBA_CHAN( dest-color,
  + dest-attrib[VARYING_SLOT_COL0] );
 
   _tnl_get_attr( ctx, vertex, _TNL_ATTRIB_COLOR1,
  dest-attrib[VARYING_SLOT_COL1]);
 
 
 Reviewed-by: Brian Paul bri...@vmware.com
 
 Do you need someone to commit this for you?

Yes please.

 
 Tag for stable branches?

Yeah that would seem appropriate. The offending commit has been there
since 7.2, but I guess 10.1 is the only still active stable branch?

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Intel-gfx] [PATCH] dri3, i915, i965: Add __DRI_IMAGE_FOURCC_SARGB8888

2013-11-25 Thread Ville Syrjälä
On Mon, Nov 25, 2013 at 09:57:23AM +0100, Daniel Vetter wrote:
 On Fri, Nov 22, 2013 at 02:12:13PM -0800, Kristian Høgsberg wrote:
  I don't know what else you'd propose?  Pass an X visual in the ioctl?
  An EGL config?  This is our name space, we can add stuff as we need
  (as Keith is doing here). include/uapi/drm/drm_fourcc.h is the
  canonical source for these values and we should add
  DRM_FORMAT_SARGB there to make sure we don't clash.
 
 Well that's kinda the problem. If you don't expect the kernel to clash
 with whatever mesa is using internally then we should add it to the
 kernel, first. That's kinda what Dave's recent rant has all been about.
 
 The other issue was that originally the idea behind fourcc was to have one
 formate namespace shared between drm, v4l and whomever else cares. If
 people are happy to drop that idea on the floor I won't shed a single
 tear.

I broke that idea alredy when I cooked up the current drm fourccs.
I didn't cross check them with any other fourcc source, so I'm 100%
sure most of them don't match anything else.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] dri3, i915, i965: Add __DRI_IMAGE_FOURCC_SARGB8888

2013-11-22 Thread Ville Syrjälä
On Fri, Nov 22, 2013 at 05:17:37PM +0100, Daniel Vetter wrote:
 On Fri, Nov 22, 2013 at 12:01 PM, Keith Packard kei...@keithp.com wrote:
  Daniel Vetter dan...@ffwll.ch writes:
 
  Hm, where do we have the canonical source for all these fourcc codes? I'm
  asking since we have our own copy in the kernel as drm_fourcc.h, and that
  one is part of the userspace ABI since we use it to pass around
  framebuffer formats and format lists.
 
  I think it's the kernel? I really don't know, as the whole notion of
  fourcc codes seems crazy to me...
 
  Feel free to steal this code and stick it in the kernel if you like.
 
 Well, I wasn't ever in favour of using fourcc codes since they're just
 not standardized at all, highly redundant in some cases and also miss
 lots of stuff we actually need (like all the rgb formats).

I also argued against them, but some people wanted them for whatever
reason. And since I didn't want to argue for several years about the
subject, I just gave in and made the drm pixel formats fourcss. But
given that I just pulled the fourccs out of my ass, we can't really
cross use them between different subsystems anyway. So if we just
consider all the different fourcc namespaces totally distinct, we're
not going to have any problems.

Personally I can promise that I will _not_ be checking Mesa/whatever
code for conflicting fourccs when I need to add a new one to drm_fourcc.h.
There, now I've given fair warning and if things explode later it won't be
my fault.

However if someone wants to emulate the drm fourcc style for whatever
reason, there is a distinct pattern how I cooked them up. Well, a few
different patterns depending whether it's RGB,YUV,packed,planar etc.

 
 Cc'ing the heck out of this to get kernel people to hopefully notice.
 Maybe someone takes charge of this ... Otherwise meh.
 
  Just afraid to create long-term maintainance madness here with the
  kernel's iron thou-shalt-not-break-userspace-ever rule ... Not likely
  we'll ever accept srgb for framebuffers though.
 
  Would suck to collide with something we do want though.
 
 Yeah, it'd suck. But given how fourcc works we probably have that
 already, just haven't noticed yet :(
 -Daniel
 -- 
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch
 ___
 dri-devel mailing list
 dri-de...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Intel-gfx] [PATCH] dri3, i915, i965: Add __DRI_IMAGE_FOURCC_SARGB8888

2013-11-22 Thread Ville Syrjälä
On Fri, Nov 22, 2013 at 02:12:13PM -0800, Kristian Høgsberg wrote:
 On Fri, Nov 22, 2013 at 05:17:37PM +0100, Daniel Vetter wrote:
  On Fri, Nov 22, 2013 at 12:01 PM, Keith Packard kei...@keithp.com wrote:
   Daniel Vetter dan...@ffwll.ch writes:
  
   Hm, where do we have the canonical source for all these fourcc codes? I'm
   asking since we have our own copy in the kernel as drm_fourcc.h, and that
   one is part of the userspace ABI since we use it to pass around
   framebuffer formats and format lists.
  
   I think it's the kernel? I really don't know, as the whole notion of
   fourcc codes seems crazy to me...
  
   Feel free to steal this code and stick it in the kernel if you like.
  
  Well, I wasn't ever in favour of using fourcc codes since they're just
  not standardized at all, highly redundant in some cases and also miss
  lots of stuff we actually need (like all the rgb formats).
 
 These drm codes are not fourcc codes in any other way than that
 they're defined by creating a 32 bit value by picking four characters.
 I don't know what PTSD triggers people have from hearing fourcc, but
 it seems severe.  Forget all that, these codes are DRM specific
 defines that are not inteded to match anything anybody else does.  It
 doesn't matter if these match of conflict with v4l, fourcc.org,
 wikipedia.org or what the amiga did.  They're just tokens that let us
 define succintly what the pixel format of a kms framebuffer is and
 tell the kernel.
 
 I don't know what else you'd propose?  Pass an X visual in the ioctl?
 An EGL config?  This is our name space, we can add stuff as we need
 (as Keith is doing here). include/uapi/drm/drm_fourcc.h is the
 canonical source for these values and we should add
 DRM_FORMAT_SARGB there to make sure we don't clash.

What is this format anyway? -ENODOCS

If its just an srgb version of ARGB, then I wouldn't really want it
in drm_fourcc.h. I expect colorspacy stuff will be handled by various
crtc/plane properties in the kernel so we don't need to encode that
stuff into the fb format.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Intel-gfx] [PATCH] dri3, i915, i965: Add __DRI_IMAGE_FOURCC_SARGB8888

2013-11-22 Thread Ville Syrjälä
On Fri, Nov 22, 2013 at 03:43:13PM -0800, Keith Packard wrote:
 Ville Syrjälä ville.syrj...@linux.intel.com writes:
 
  What is this format anyway? -ENODOCS
 
 Same as MESA_FORMAT_SARGB8 and __DRI_IMAGE_FORMAT_SARGB8 :-)
 
  If its just an srgb version of ARGB, then I wouldn't really want it
  in drm_fourcc.h. I expect colorspacy stuff will be handled by various
  crtc/plane properties in the kernel so we don't need to encode that
  stuff into the fb format.
 
 It's not any different from splitting YUV codes from RGB codes;

Not really. Saying something is YUV (or rather Y'CbCr) doesn't
actually tell you the color space. It just tells you whether the
information is encoded as R+G+B or Y+Cb+Cr. How you convert between
them is another matter. You need to know the gamma, color primaries,
chroma siting for sub-sampled YCbCr formats, etc.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] i915: crash in test, only OpenGL 1.4, on Windows OpenGL 2

2013-09-05 Thread Ville Syrjälä
On Thu, Sep 05, 2013 at 03:03:33AM +0200, Ondrej Riha wrote:
 Hello,
 
 I have i915: Atom N455 and have problem with OpenGL ES 2.0 test. On Windows I 
 have OpenGL 2, but on Linux I have only OpenGL 1.4 and therefore OpenGL ES 
 2.0 test crashed. For more info see:
 
 https://bugs.launchpad.net/glmark2/+bug/1220783

You've managed to mail the entirely wrong person for this question. I'm Cc:ing
the Mesa list as they should be able to give you an accurate answer.

But I do believe there was a relatively recent change to make the i915
Mesa driver always advertise OpenGL 2.1 support for gen3 hardware. So maybe
Mesa 9.2ish or so?

I think your hardware is a pineview (can't really check w/o the pci id
though), and that should be gen3 AFAIK.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT

2013-09-03 Thread Ville Syrjälä
On Thu, Aug 15, 2013 at 10:39:31PM +0200, Vedran Rodic wrote:
  We do have the set_caching ioctl. It's enough to flip the PTEs to UC and
  let MOCS manage things. I actually did a few experiments on my IVB. I
  made all Mesa's buffers UC via PTEs by patching libdrm to change the
  cache mode of each bo after allocation. Then I fiddled with the MOCS
  LLC bits in various ways. It definitely has an effect, sometimes making
  things slower, sometimes faster. xonotic again seemed to benefit. IIRC
  leaving everything LLC uncached was actually the fastest (w/ high quality
  at least) so we may be thrashing the LLC a bit there. But eg. reaction
  quake regressed quite a lot if most things were left as UC.
 
 Can you share the libdrm patch?

Sorry, forgot to reply.

Here's the patch if you're still interested.

From 47f51b19137603dccaa4fcb2a703d56335c292fe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= ville.syrj...@linux.intel.com
Date: Wed, 14 Aug 2013 15:12:29 +0300
Subject: [PATCH] make bos uncached in PTEs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
---
 intel/intel_bufmgr_gem.c | 60 ++--
 1 file changed, 53 insertions(+), 7 deletions(-)

diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c
index f98f7a7..32ff260 100644
--- a/intel/intel_bufmgr_gem.c
+++ b/intel/intel_bufmgr_gem.c
@@ -243,6 +243,10 @@ drm_intel_gem_bo_get_tiling(drm_intel_bo *bo, uint32_t * 
tiling_mode,
uint32_t * swizzle_mode);
 
 static int
+drm_intel_gem_bo_set_caching_internal(drm_intel_bo *bo,
+ uint32_t cache_mode);
+
+static int
 drm_intel_gem_bo_set_tiling_internal(drm_intel_bo *bo,
 uint32_t tiling_mode,
 uint32_t stride);
@@ -695,6 +699,7 @@ retry:
drm_intel_gem_bo_free(bo_gem-bo);
goto retry;
}
+
}
}
pthread_mutex_unlock(bufmgr_gem-lock);
@@ -761,9 +766,16 @@ drm_intel_gem_bo_alloc_for_render(drm_intel_bufmgr *bufmgr,
  unsigned long size,
  unsigned int alignment)
 {
-   return drm_intel_gem_bo_alloc_internal(bufmgr, name, size,
-  BO_ALLOC_FOR_RENDER,
-  I915_TILING_NONE, 0);
+   drm_intel_bo *bo;
+
+   bo = drm_intel_gem_bo_alloc_internal(bufmgr, name, size,
+BO_ALLOC_FOR_RENDER,
+I915_TILING_NONE, 0);
+
+   if (bo)
+   drm_intel_gem_bo_set_caching_internal(bo, I915_CACHEING_NONE);
+
+   return bo;
 }
 
 static drm_intel_bo *
@@ -772,8 +784,15 @@ drm_intel_gem_bo_alloc(drm_intel_bufmgr *bufmgr,
   unsigned long size,
   unsigned int alignment)
 {
-   return drm_intel_gem_bo_alloc_internal(bufmgr, name, size, 0,
-  I915_TILING_NONE, 0);
+   drm_intel_bo *bo;
+
+   bo = drm_intel_gem_bo_alloc_internal(bufmgr, name, size, 0,
+I915_TILING_NONE, 0);
+
+   if (bo)
+   drm_intel_gem_bo_set_caching_internal(bo, I915_CACHEING_CACHED);
+
+   return bo;
 }
 
 static drm_intel_bo *
@@ -784,6 +803,7 @@ drm_intel_gem_bo_alloc_tiled(drm_intel_bufmgr *bufmgr, 
const char *name,
drm_intel_bufmgr_gem *bufmgr_gem = (drm_intel_bufmgr_gem *)bufmgr;
unsigned long size, stride;
uint32_t tiling;
+   drm_intel_bo *bo;
 
do {
unsigned long aligned_y, height_alignment;
@@ -824,8 +844,13 @@ drm_intel_gem_bo_alloc_tiled(drm_intel_bufmgr *bufmgr, 
const char *name,
if (tiling == I915_TILING_NONE)
stride = 0;
 
-   return drm_intel_gem_bo_alloc_internal(bufmgr, name, size, flags,
-  tiling, stride);
+   bo = drm_intel_gem_bo_alloc_internal(bufmgr, name, size, flags,
+tiling, stride);
+
+   if (bo)
+   drm_intel_gem_bo_set_caching_internal(bo, I915_CACHEING_NONE);
+
+   return bo;
 }
 
 /**
@@ -2363,6 +2388,27 @@ drm_intel_gem_bo_unpin(drm_intel_bo *bo)
 }
 
 static int
+drm_intel_gem_bo_set_caching_internal(drm_intel_bo *bo,
+ uint32_t cache_mode)
+{
+   drm_intel_bufmgr_gem *bufmgr_gem = (drm_intel_bufmgr_gem *) bo-bufmgr;
+   drm_intel_bo_gem *bo_gem = (drm_intel_bo_gem *) bo;
+   struct drm_i915_gem_cacheing set_caching;
+   int ret;
+
+   memset(set_caching, 0, sizeof(set_caching));
+
+   set_caching.handle = bo_gem-gem_handle

Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT

2013-08-15 Thread Ville Syrjälä
On Thu, Aug 15, 2013 at 08:08:12AM -0700, Chad Versace wrote:
 On 08/14/2013 12:50 AM, Ville Syrjälä wrote:
  On Wed, Aug 14, 2013 at 10:45:23AM +0300, Ville Syrjälä wrote:
  On Tue, Aug 13, 2013 at 05:46:55PM -0700, Chad Versace wrote:
  On 08/13/2013 03:31 PM, Vedran Rodic wrote:
  On Mon, Aug 12, 2013 at 3:07 PM,  ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
 
 
  For L3 cacheability, IVB won't consult the PTE for anything that has a
  relevant MOCS field. So even if you make everything L3 cacheable through
  the PTEs, MOCS will always override it. How do i know you ask? Well,
  BSpec says so for one, and more importantly I verified this by running
  some tests [...]
 
 
 I suspected this. Thanks for sharing your experimental evidence.
 
 
  For LLC cachebility the story will be different because there IVB MOCS
  can only say LLC cacheable or consult the PTE. So to make stuff
  uncached in LLC on IVB, we'd need to issues the set_caching ioctl to
  change the PTE to uncached, and after that we could use just the MOCS
  to select the LLC caching policy. Since the set_caching ioctl only needs
  to be issued once per object (or you could use the , there
 
  Sorry hit send by accident. Was going to say we could use the new
  create2 ioctl Chris has proposed that allows you to set the cache mode
  when creating the object. So there won't be a performance hit from
  extra ioctls getting issued all the time.
 
 
 I would like such a cache-control ioctl, as long the ioctl can also
 be used to change the object's cacheing policy in addition to
 setting it at object creation. This would be
 needed when an object's usage oscillates between texture surface
 and render target.

We do have the set_caching ioctl. It's enough to flip the PTEs to UC and
let MOCS manage things. I actually did a few experiments on my IVB. I
made all Mesa's buffers UC via PTEs by patching libdrm to change the
cache mode of each bo after allocation. Then I fiddled with the MOCS
LLC bits in various ways. It definitely has an effect, sometimes making
things slower, sometimes faster. xonotic again seemed to benefit. IIRC
leaving everything LLC uncached was actually the fastest (w/ high quality
at least) so we may be thrashing the LLC a bit there. But eg. reaction
quake regressed quite a lot if most things were left as UC.

I should probably run through a few MOCS combinations and collect a bit
more data. But it's looking like some sensible heuristic has to be
involved since different benchmarks show conflicting results. Maybe
your LLC overcommit prevention approach would be the one. Are you
planning to continue with that work?

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT

2013-08-14 Thread Ville Syrjälä
On Tue, Aug 13, 2013 at 05:46:55PM -0700, Chad Versace wrote:
 On 08/13/2013 03:31 PM, Vedran Rodic wrote:
  On Mon, Aug 12, 2013 at 3:07 PM,  ville.syrj...@linux.intel.com wrote:
  From: Ville Syrjälä ville.syrj...@linux.intel.com
 
  IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
  so let's make use of it.
 
  According to the discussion we had on #intel-gfx a few weeks ago, on
  IVB all Mesa memory is already marked as cached in DRM allocated PTEs.
  So this should not have any effect. Or I'm misunderstanding something.
 
  As I understand, marking everything uncacheable and then marking just
  certain things cacheable could make a difference (since AFAIK, you
  can't mark select regions as uncacheable after you mark PTEs as
  cacheable on IVB).
 
  Can somebody more knowledgeable comment?
 
 On Ivybridge, the PTEs mark only contexts as LLC+L3 cacheable. Everything
 else is marked as cacheable in LLC, but not L3. So, Ville's patches will
 give a perf boost to Mesa running on any kernel that continues that cacheing
 policy.

There's a bit more to that story.

For L3 cacheability, IVB won't consult the PTE for anything that has a
relevant MOCS field. So even if you make everything L3 cacheable through
the PTEs, MOCS will always override it. How do i know you ask? Well,
BSpec says so for one, and more importantly I verified this by running
some tests on a patched kernel that makes all currently LLC cacheable
PTEs LLC+L3 cacheable. The patched kernel had similar performance
numbers as the unpatched kernel, and the MOCS patches had the same
effect on both kernels.

According to BSpec there are certain things that don't have a MOCS field,
so in theory the L3 PTE setting should have some effect, but for the
(mostly) gaming benchmarks I ran there didn't seem to be a significant
difference.

For LLC cachebility the story will be different because there IVB MOCS
can only say LLC cacheable or consult the PTE. So to make stuff
uncached in LLC on IVB, we'd need to issues the set_caching ioctl to
change the PTE to uncached, and after that we could use just the MOCS
to select the LLC caching policy. Since the set_caching ioctl only needs
to be issued once per object (or you could use the , there 

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/gen7: Set MOCS L3 cacheability for IVB/BYT

2013-08-14 Thread Ville Syrjälä
On Wed, Aug 14, 2013 at 10:45:23AM +0300, Ville Syrjälä wrote:
 On Tue, Aug 13, 2013 at 05:46:55PM -0700, Chad Versace wrote:
  On 08/13/2013 03:31 PM, Vedran Rodic wrote:
   On Mon, Aug 12, 2013 at 3:07 PM,  ville.syrj...@linux.intel.com wrote:
   From: Ville Syrjälä ville.syrj...@linux.intel.com
  
   IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
   so let's make use of it.
  
   According to the discussion we had on #intel-gfx a few weeks ago, on
   IVB all Mesa memory is already marked as cached in DRM allocated PTEs.
   So this should not have any effect. Or I'm misunderstanding something.
  
   As I understand, marking everything uncacheable and then marking just
   certain things cacheable could make a difference (since AFAIK, you
   can't mark select regions as uncacheable after you mark PTEs as
   cacheable on IVB).
  
   Can somebody more knowledgeable comment?
  
  On Ivybridge, the PTEs mark only contexts as LLC+L3 cacheable. Everything
  else is marked as cacheable in LLC, but not L3. So, Ville's patches will
  give a perf boost to Mesa running on any kernel that continues that cacheing
  policy.
 
 There's a bit more to that story.
 
 For L3 cacheability, IVB won't consult the PTE for anything that has a
 relevant MOCS field. So even if you make everything L3 cacheable through
 the PTEs, MOCS will always override it. How do i know you ask? Well,
 BSpec says so for one, and more importantly I verified this by running
 some tests on a patched kernel that makes all currently LLC cacheable
 PTEs LLC+L3 cacheable. The patched kernel had similar performance
 numbers as the unpatched kernel, and the MOCS patches had the same
 effect on both kernels.
 
 According to BSpec there are certain things that don't have a MOCS field,
 so in theory the L3 PTE setting should have some effect, but for the
 (mostly) gaming benchmarks I ran there didn't seem to be a significant
 difference.
 
 For LLC cachebility the story will be different because there IVB MOCS
 can only say LLC cacheable or consult the PTE. So to make stuff
 uncached in LLC on IVB, we'd need to issues the set_caching ioctl to
 change the PTE to uncached, and after that we could use just the MOCS
 to select the LLC caching policy. Since the set_caching ioctl only needs
 to be issued once per object (or you could use the , there 

Sorry hit send by accident. Was going to say we could use the new
create2 ioctl Chris has proposed that allows you to set the cache mode
when creating the object. So there won't be a performance hit from
extra ioctls getting issued all the time.

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Bugs in fbo-sys-blit (problems with fake front and missing invalidates)

2013-05-29 Thread Ville Syrjälä
On Wed, May 29, 2013 at 01:39:00PM -0700, Paul Berry wrote:
 I started this discussion with just some of the Intel folks, and Eric
 suggested I bring it to the mesa-dev list.
 
 The short version is: in my efforts to implement fast color clears on Intel
 hardware, I've uncovered some problems with the fbo-sys-blit piglit test,
 which I've traced to 3 bugs.  I could use some help with bug (2) below,
 which I suspect is a bug in the X server.
 
 -- Forwarded message (footnotes added) --
 
 First of all, a brief summary of what fbo-sys-blit's piglit_display() does:
 
 a. Clear the back buffer to green.
 b. Copy the back buffer to the front buffer using a blit.
 c. Clear the back buffer to red.
 d. Read the contents of the front buffer to verify that it's green.
 
 I've found three bugs:
snip

Did you try this [1] patch from Chris? I had a similar patch in a
private tree way back when, but for some reason I decided that it's not
necessary, and hence didn't send it out. I can't actually recall why I
came to that conclusion, nor can I think of a good reason now either.
But it's been quite a while since I poked at any dri2 or composite stuff
so I may be out of practice.

[1] http://patchwork.freedesktop.org/patch/12111/

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium/dri: Handle xserver that doesn't send needless DRI2 invalidate events

2012-01-31 Thread Ville Syrjälä
From: Ville Syrjala syrj...@sci.fi

Ever since xserver commit 531869448d07e00ae241120b59f35709d59c,
the server no longer sends invalidate events to clients, unless they
have performed a GetBuffers request since the drawable was last
invalidated.

If the drawable gets invalidated immediately after the GetBuffers
request was processed by the X server, it's possible that Xlib
will process the invalidate event while waiting for the GetBuffers
reply. So the server, thinking the client knows that the buffers
are invalid, is waiting for another GetBuffers request before
sending any more invalidate events. The client, on the other hand,
believes the buffers to be valid, and thus is expecting to receive
another invalidate event before it has to send another GetBuffers
request. The end result is that the client never again sends
a GetBuffers request.

To avoid this problem, take a snapshot of the lastStamp before
doing GetBuffers, and retry if the snapshot and the current
lastStamp no longer match after the GetBuffers reply has been
processed.

Signed-off-by: Ville Syrjälä syrj...@sci.fi
---
Sigh. So no one actually took the bait based on the original
patch submission (apart from a small nibble from Dave).
Let's try again...

Distros are shipping the broken xserver-1.11 + Mesa 7.11 combo,
so either people don't care about the breakage, or no one
resizes their windows anymore. Either way I'd like at least my
system to work correctly w/o manual patching.

 .../state_trackers/dri/common/dri_drawable.c   |   30 +++
 1 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/gallium/state_trackers/dri/common/dri_drawable.c 
b/src/gallium/state_trackers/dri/common/dri_drawable.c
index 65b3d04..5a261dd 100644
--- a/src/gallium/state_trackers/dri/common/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/common/dri_drawable.c
@@ -53,6 +53,7 @@ dri_st_framebuffer_validate(struct st_framebuffer_iface 
*stfbi,
unsigned statt_mask, new_mask;
boolean new_stamp;
int i;
+   unsigned int lastStamp;
 
statt_mask = 0x0;
for (i = 0; i  count; i++)
@@ -66,23 +67,26 @@ dri_st_framebuffer_validate(struct st_framebuffer_iface 
*stfbi,
 * client stamp.  It has the value of the server stamp when last
 * checked.
 */
-   new_stamp = (drawable-texture_stamp != drawable-dPriv-lastStamp);
+   do {
+  lastStamp = drawable-dPriv-lastStamp;
+  new_stamp = (drawable-texture_stamp != lastStamp);
 
-   if (new_stamp || new_mask || screen-broken_invalidate) {
-  if (new_stamp  drawable-update_drawable_info)
- drawable-update_drawable_info(drawable);
+  if (new_stamp || new_mask || screen-broken_invalidate) {
+ if (new_stamp  drawable-update_drawable_info)
+drawable-update_drawable_info(drawable);
 
-  drawable-allocate_textures(drawable, statts, count);
+ drawable-allocate_textures(drawable, statts, count);
 
-  /* add existing textures */
-  for (i = 0; i  ST_ATTACHMENT_COUNT; i++) {
- if (drawable-textures[i])
-statt_mask |= (1  i);
-  }
+ /* add existing textures */
+ for (i = 0; i  ST_ATTACHMENT_COUNT; i++) {
+if (drawable-textures[i])
+   statt_mask |= (1  i);
+ }
 
-  drawable-texture_stamp = drawable-dPriv-lastStamp;
-  drawable-texture_mask = statt_mask;
-   }
+ drawable-texture_stamp = lastStamp;
+ drawable-texture_mask = statt_mask;
+  }
+   } while (lastStamp != drawable-dPriv-lastStamp);
 
if (!out)
   return TRUE;
-- 
1.7.3.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7.11] gallium/dri: Handle xserver that doesn't send needless DRI2 invalidate events

2012-01-15 Thread Ville Syrjälä
On Fri, Jan 13, 2012 at 09:10:15AM +, Dave Airlie wrote:
 On Sun, Dec 18, 2011 at 4:22 PM, Ville Syrjälä syrj...@sci.fi wrote:
  Ever since xserver commit 531869448d07e00ae241120b59f35709d59c,
  the server no longer sends invalidate events to clients, unless they
  have performed a GetBuffers request since the drawable was last
  invalidated.
 
  If the drawable gets invalidated immediately after the GetBuffers
  request was processed by the X server, it's possible that Xlib
  will process the invalidate event while waiting for the GetBuffers
  reply. So the server, thinking the client knows that the buffers
  are invalid, is waiting for another GetBuffers request before
  sending any more invalidate events. The client, on the other hand,
  believes the buffers to be valid, and thus is expecting to receive
  another invalidate event before it has to send another GetBuffers
  request. The end result is that the client never again sends
  a GetBuffers request.
 
  To avoid this problem, take a snapshot of lastStamp before
  doing GetBuffers, and retry if the snapshot and the current
  lastStamp no longer match after the GetBuffers reply has been
  processed.
 
  Signed-off-by: Ville Syrjälä syrj...@sci.fi
  ---
  It looks like master should already handle this, as there's a
  retry loop inside st_framebuffer_validate(). I didn't test that
  in practice though.
 
 I'd really like to know if master can handle it before pulling a patch
 that isn't in master into 7.11.

I now managed to test master as well, and in fact it's also affected
by this bug.

After thinking about it a bit more I can see why that is.
st_framebuffer_validate() has the retry loop all right, but when it
calls dri_st_framebuffer_validate() during the retry,
dri_st_framebuffer_validate() thinks that the buffers are valid and
doesn't call allocate_textures(). Hence the retry loop actually does
nothing useful in this case.

So the 7.11 fix can be applied to master as well (the patch applies to
master cleanly), or we could go for a slightly more simple fix for
master due to the pre-existing retry loop. I'll leave that decision up
to someone else.

This is the simple approach:

diff --git a/src/gallium/state_trackers/dri/common/dri_drawable.c 
b/src/gallium/state_trackers/dri/common/dri_drawable.c
index 65b3d04..6effb11 100644
--- a/src/gallium/state_trackers/dri/common/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/common/dri_drawable.c
@@ -53,6 +53,7 @@ dri_st_framebuffer_validate(struct st_framebuffer_iface 
*stfbi,
unsigned statt_mask, new_mask;
boolean new_stamp;
int i;
+   unsigned int lastStamp;
 
statt_mask = 0x0;
for (i = 0; i  count; i++)
@@ -66,7 +67,8 @@ dri_st_framebuffer_validate(struct st_framebuffer_iface 
*stfbi,
 * client stamp.  It has the value of the server stamp when last
 * checked.
 */
-   new_stamp = (drawable-texture_stamp != drawable-dPriv-lastStamp);
+   lastStamp = drawable-dPriv-lastStamp;
+   new_stamp = (drawable-texture_stamp != lastStamp);
 
if (new_stamp || new_mask || screen-broken_invalidate) {
   if (new_stamp  drawable-update_drawable_info)
@@ -80,7 +82,7 @@ dri_st_framebuffer_validate(struct st_framebuffer_iface 
*stfbi,
 statt_mask |= (1  i);
   }
 
-  drawable-texture_stamp = drawable-dPriv-lastStamp;
+  drawable-texture_stamp = lastStamp;
   drawable-texture_mask = statt_mask;
}
 

To verify my bugfix I used xtrace like so:
xtrace -n -k | tee mesa.log | grep -B1 Invalidate | grep Get

When the bug is tripped that will print the GetBuffers reply that was
being processed when the bad InvalidateBuffers event was received.

Using that xtrace incantation and glxgears I verified that both the
original patch and the simple patch fix the bug.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7.11] gallium/dri: Handle xserver that doesn't send needless DRI2 invalidate events

2012-01-13 Thread Ville Syrjälä
PING PING PING! No one cares about Mesa 7.11 anymore?

On Sat, Jan 07, 2012 at 12:12:24AM +0200, Ville Syrjälä wrote:
 On Sun, Dec 18, 2011 at 06:22:01PM +0200, Ville Syrjälä wrote:
  Ever since xserver commit 531869448d07e00ae241120b59f35709d59c,
  the server no longer sends invalidate events to clients, unless they
  have performed a GetBuffers request since the drawable was last
  invalidated.
  
  If the drawable gets invalidated immediately after the GetBuffers
  request was processed by the X server, it's possible that Xlib
  will process the invalidate event while waiting for the GetBuffers
  reply. So the server, thinking the client knows that the buffers
  are invalid, is waiting for another GetBuffers request before
  sending any more invalidate events. The client, on the other hand,
  believes the buffers to be valid, and thus is expecting to receive
  another invalidate event before it has to send another GetBuffers
  request. The end result is that the client never again sends
  a GetBuffers request.
  
  To avoid this problem, take a snapshot of lastStamp before
  doing GetBuffers, and retry if the snapshot and the current
  lastStamp no longer match after the GetBuffers reply has been
  processed.
  
  Signed-off-by: Ville Syrjälä syrj...@sci.fi
 
 Ping. Anyone interested in having Mesa 7.11 work with
 xserver = 1.11?
 
 -- 
 Ville Syrjälä
 syrj...@sci.fi
 http://www.sci.fi/~syrjala/

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7.11] gallium/dri: Handle xserver that doesn't send needless DRI2 invalidate events

2012-01-13 Thread Ville Syrjälä
On Fri, Jan 13, 2012 at 09:10:15AM +, Dave Airlie wrote:
 On Sun, Dec 18, 2011 at 4:22 PM, Ville Syrjälä syrj...@sci.fi wrote:
  Ever since xserver commit 531869448d07e00ae241120b59f35709d59c,
  the server no longer sends invalidate events to clients, unless they
  have performed a GetBuffers request since the drawable was last
  invalidated.
 
  If the drawable gets invalidated immediately after the GetBuffers
  request was processed by the X server, it's possible that Xlib
  will process the invalidate event while waiting for the GetBuffers
  reply. So the server, thinking the client knows that the buffers
  are invalid, is waiting for another GetBuffers request before
  sending any more invalidate events. The client, on the other hand,
  believes the buffers to be valid, and thus is expecting to receive
  another invalidate event before it has to send another GetBuffers
  request. The end result is that the client never again sends
  a GetBuffers request.
 
  To avoid this problem, take a snapshot of lastStamp before
  doing GetBuffers, and retry if the snapshot and the current
  lastStamp no longer match after the GetBuffers reply has been
  processed.
 
  Signed-off-by: Ville Syrjälä syrj...@sci.fi
  ---
  It looks like master should already handle this, as there's a
  retry loop inside st_framebuffer_validate(). I didn't test that
  in practice though.
 
 I'd really like to know if master can handle it before pulling a patch
 that isn't in master into 7.11.

Well, for someone using master, it should be easy to test. Just run
glxgears and resize the window aggressively. So far I've been too lazy
to figure out what, if anything, I need to upgrade to try master.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7.11] gallium/dri: Handle xserver that doesn't send needless DRI2 invalidate events

2012-01-06 Thread Ville Syrjälä
On Sun, Dec 18, 2011 at 06:22:01PM +0200, Ville Syrjälä wrote:
 Ever since xserver commit 531869448d07e00ae241120b59f35709d59c,
 the server no longer sends invalidate events to clients, unless they
 have performed a GetBuffers request since the drawable was last
 invalidated.
 
 If the drawable gets invalidated immediately after the GetBuffers
 request was processed by the X server, it's possible that Xlib
 will process the invalidate event while waiting for the GetBuffers
 reply. So the server, thinking the client knows that the buffers
 are invalid, is waiting for another GetBuffers request before
 sending any more invalidate events. The client, on the other hand,
 believes the buffers to be valid, and thus is expecting to receive
 another invalidate event before it has to send another GetBuffers
 request. The end result is that the client never again sends
 a GetBuffers request.
 
 To avoid this problem, take a snapshot of lastStamp before
 doing GetBuffers, and retry if the snapshot and the current
 lastStamp no longer match after the GetBuffers reply has been
 processed.
 
 Signed-off-by: Ville Syrjälä syrj...@sci.fi

Ping. Anyone interested in having Mesa 7.11 work with
xserver = 1.11?

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7.11] gallium/dri: Handle xserver that doesn't send needless DRI2 invalidate events

2011-12-18 Thread Ville Syrjälä
Ever since xserver commit 531869448d07e00ae241120b59f35709d59c,
the server no longer sends invalidate events to clients, unless they
have performed a GetBuffers request since the drawable was last
invalidated.

If the drawable gets invalidated immediately after the GetBuffers
request was processed by the X server, it's possible that Xlib
will process the invalidate event while waiting for the GetBuffers
reply. So the server, thinking the client knows that the buffers
are invalid, is waiting for another GetBuffers request before
sending any more invalidate events. The client, on the other hand,
believes the buffers to be valid, and thus is expecting to receive
another invalidate event before it has to send another GetBuffers
request. The end result is that the client never again sends
a GetBuffers request.

To avoid this problem, take a snapshot of lastStamp before
doing GetBuffers, and retry if the snapshot and the current
lastStamp no longer match after the GetBuffers reply has been
processed.

Signed-off-by: Ville Syrjälä syrj...@sci.fi
---
It looks like master should already handle this, as there's a
retry loop inside st_framebuffer_validate(). I didn't test that
in practice though.

 .../state_trackers/dri/common/dri_drawable.c   |   30 +++
 1 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/gallium/state_trackers/dri/common/dri_drawable.c 
b/src/gallium/state_trackers/dri/common/dri_drawable.c
index 28a33ac..6748fbc 100644
--- a/src/gallium/state_trackers/dri/common/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/common/dri_drawable.c
@@ -51,6 +51,7 @@ dri_st_framebuffer_validate(struct st_framebuffer_iface 
*stfbi,
unsigned statt_mask, new_mask;
boolean new_stamp;
int i;
+   unsigned int lastStamp;
 
statt_mask = 0x0;
for (i = 0; i  count; i++)
@@ -64,23 +65,26 @@ dri_st_framebuffer_validate(struct st_framebuffer_iface 
*stfbi,
 * least for DRI1.  dPriv-lastStamp is the client stamp.  It has the value
 * of the server stamp when last checked.
 */
-   new_stamp = (drawable-texture_stamp != drawable-dPriv-lastStamp);
+   do {
+  lastStamp = drawable-dPriv-lastStamp;
+  new_stamp = (drawable-texture_stamp != lastStamp);
 
-   if (new_stamp || new_mask || screen-broken_invalidate) {
-  if (new_stamp  drawable-update_drawable_info)
- drawable-update_drawable_info(drawable);
+  if (new_stamp || new_mask || screen-broken_invalidate) {
+ if (new_stamp  drawable-update_drawable_info)
+drawable-update_drawable_info(drawable);
 
-  drawable-allocate_textures(drawable, statts, count);
+ drawable-allocate_textures(drawable, statts, count);
 
-  /* add existing textures */
-  for (i = 0; i  ST_ATTACHMENT_COUNT; i++) {
- if (drawable-textures[i])
-statt_mask |= (1  i);
-  }
+ /* add existing textures */
+ for (i = 0; i  ST_ATTACHMENT_COUNT; i++) {
+if (drawable-textures[i])
+   statt_mask |= (1  i);
+ }
 
-  drawable-texture_stamp = drawable-dPriv-lastStamp;
-  drawable-texture_mask = statt_mask;
-   }
+ drawable-texture_stamp = lastStamp;
+ drawable-texture_mask = statt_mask;
+  }
+   } while (lastStamp != drawable-dPriv-lastStamp);
 
if (!out)
   return TRUE;
-- 
1.7.3.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev