Re: [Mesa-dev] [PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT
On 23.07.2014 15:42, Christian König wrote:
> On 23.07.2014 05:54, Michel Dänzer wrote:
>> On 21.07.2014 17:07, Christian König wrote:
>>> On 19.07.2014 03:15, Michel Dänzer wrote:
>>>> On 19.07.2014 00:47, Christian König wrote:
>>>>> On 18.07.2014 05:07, Michel Dänzer wrote:
>>>>>>>> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI
>>>>>>> I'm still not very keen on this change since I still don't
>>>>>>> understand the reason why it's faster than with GTT. Definitely
>>>>>>> needs more testing on a wider range of systems.
>>>>>> Sure. If anyone wants to give this patch a spin and see if they can
>>>>>> measure any performance difference, good or bad, that would be
>>>>>> interesting.
>>>>>>
>>>>>>> Maybe limit it to APUs for now?
>>>>>> But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an
>>>>>> even bigger win with dedicated GPUs than with the Kaveri built-in
>>>>>> GPU on my system. I suspect it may depend on the bandwidth
>>>>>> available for PCIe vs. system memory though.
>>>>> I've made a few tests today with the kernel part of the patches
>>>>> running Xonotic on Ultra in 1920 x 1080.
>>>>>
>>>>> Without any patches I get around ~47.0fps on average with my
>>>>> dedicated HD7870.
>>>>>
>>>>> Adding only "drm/radeon: Use write-combined CPU mappings of rings and
>>>>> IBs on >= SI" and that goes down to ~45.3fps.
>>>>>
>>>>> Adding on top of that "drm/radeon: Use VRAM for indirect buffers on
>>>>> >= SI" and the frame rate goes down to ~27.74fps.
>>>> Hmm, looks like I'll need to do more benchmarking of 3D workloads as
>>>> well.
>> I haven't been able to consistently[0] measure any significant
>> difference between all placements of the rings and IBs with Xonotic or
>> Reaction Quake with my Bonaire. I'd expect Xonotic to be shader / GPU
>> memory bandwidth bound rather than CS bound anyway, so a ~40% hit from
>> that kernel patch alone is very surprising. Are you sure it wasn't just
>> the same kind of variation as described below?
>
> Yes, I've measured that multiple times and the results were quite
> consistent.
>
> But I didn't measure it on a Bonaire, where the bottleneck probably
> isn't the CPU load. I measured it on a fast Pitcairn

Ahem, my Bonaire is cranking out ~90fps of Xonotic Ultra at 1920x1080. :)
(And AFAIK there are even faster Bonaire variants)

> and there Xonotic was clearly affected by the patches.

Okay, I hadn't realized we're not doing any command stream checking as
of CIK, that probably explains the difference.

>>> My tests clearly show that we still can use USWC for the ring buffer on
>>> SI and probably earlier chips as well.
>> Yeah, that might be the safest approach for now.
> How about using USWC for the rings on all chips since R600

Any particular reason against doing it for older chips which support
unsnooped access as well?

> and for the IB only on CIK? As far as I can see that should do the trick
> quite well.

Yeah, sounds good.

-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT
On 23.07.2014 09:21, Michel Dänzer wrote:
> On 23.07.2014 15:42, Christian König wrote:
>> On 23.07.2014 05:54, Michel Dänzer wrote:
>>> On 21.07.2014 17:07, Christian König wrote:
>>>> On 19.07.2014 03:15, Michel Dänzer wrote:
>>>>> On 19.07.2014 00:47, Christian König wrote:
>>>>>> On 18.07.2014 05:07, Michel Dänzer wrote:
>>>>>>>>> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI
>>>>>>>> I'm still not very keen on this change since I still don't
>>>>>>>> understand the reason why it's faster than with GTT. Definitely
>>>>>>>> needs more testing on a wider range of systems.
>>>>>>> Sure. If anyone wants to give this patch a spin and see if they
>>>>>>> can measure any performance difference, good or bad, that would be
>>>>>>> interesting.
>>>>>>>> Maybe limit it to APUs for now?
>>>>>>> But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an
>>>>>>> even bigger win with dedicated GPUs than with the Kaveri built-in
>>>>>>> GPU on my system. I suspect it may depend on the bandwidth
>>>>>>> available for PCIe vs. system memory though.
>>>>>> I've made a few tests today with the kernel part of the patches
>>>>>> running Xonotic on Ultra in 1920 x 1080.
>>>>>>
>>>>>> Without any patches I get around ~47.0fps on average with my
>>>>>> dedicated HD7870.
>>>>>>
>>>>>> Adding only "drm/radeon: Use write-combined CPU mappings of rings
>>>>>> and IBs on >= SI" and that goes down to ~45.3fps.
>>>>>>
>>>>>> Adding on top of that "drm/radeon: Use VRAM for indirect buffers on
>>>>>> >= SI" and the frame rate goes down to ~27.74fps.
>>>>> Hmm, looks like I'll need to do more benchmarking of 3D workloads as
>>>>> well.
>>> I haven't been able to consistently[0] measure any significant
>>> difference between all placements of the rings and IBs with Xonotic or
>>> Reaction Quake with my Bonaire. I'd expect Xonotic to be shader / GPU
>>> memory bandwidth bound rather than CS bound anyway, so a ~40% hit from
>>> that kernel patch alone is very surprising. Are you sure it wasn't just
>>> the same kind of variation as described below?
>> Yes, I've measured that multiple times and the results were quite
>> consistent.
>>
>> But I didn't measure it on a Bonaire, where the bottleneck probably
>> isn't the CPU load. I measured it on a fast Pitcairn
> Ahem, my Bonaire is cranking out ~90fps of Xonotic Ultra at 1920x1080. :)
> (And AFAIK there are even faster Bonaire variants)

My Bonaire only makes something around 17fps with Xonotic Ultra at
1920x1080, might be a good idea to figure out why at some point.

>> and there Xonotic was clearly affected by the patches.
> Okay, I hadn't realized we're not doing any command stream checking as
> of CIK, that probably explains the difference.

Good point, I should probably test the "put IBs in VRAM" patch with my
Bonaire as well.

>>>> My tests clearly show that we still can use USWC for the ring buffer
>>>> on SI and probably earlier chips as well.
>>> Yeah, that might be the safest approach for now.
>> How about using USWC for the rings on all chips since R600
> Any particular reason against doing it for older chips which support
> unsnooped access as well?

Not really, I just didn't notice that older chips can do this as well.

Christian.

>> and for the IB only on CIK? As far as I can see that should do the trick
>> quite well.
> Yeah, sounds good.
[Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_query_lod
Requires Evergreen or later --- Passes ARB_texture_query_lod piglits, no other regressions, tested on radeon 6670. docs/GL3.txt | 2 +- src/gallium/drivers/r600/r600_pipe.c | 2 +- src/gallium/drivers/r600/r600_shader.c | 13 - 3 files changed, 14 insertions(+), 3 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 8128692..d481148 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -119,7 +119,7 @@ GL 4.0: GL_ARB_texture_buffer_object_rgb32 DONE (i965, nvc0, r600, radeonsi, softpipe) GL_ARB_texture_cube_map_arrayDONE (i965, nv50, nvc0, r600, radeonsi, softpipe) GL_ARB_texture_gatherDONE (i965, nv50, nvc0, radeonsi, r600) - GL_ARB_texture_query_lod DONE (i965, nv50, nvc0, radeonsi) + GL_ARB_texture_query_lod DONE (i965, nv50, nvc0, r600, radeonsi) GL_ARB_transform_feedback2 DONE (i965, nv50, nvc0, r600, radeonsi) GL_ARB_transform_feedback3 DONE (i965, nv50, nvc0, r600, radeonsi) diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index 5bf9c00..7c50169 100644 --- a/src/gallium/drivers/r600/r600_pipe.c +++ b/src/gallium/drivers/r600/r600_pipe.c @@ -304,6 +304,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param) case PIPE_CAP_CUBE_MAP_ARRAY: case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT: case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS: + case PIPE_CAP_TEXTURE_QUERY_LOD: return family >= CHIP_CEDAR ? 1 : 0; /* Unsupported features. 
*/ @@ -314,7 +315,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param) case PIPE_CAP_VERTEX_COLOR_CLAMPED: case PIPE_CAP_USER_VERTEX_BUFFERS: case PIPE_CAP_TEXTURE_GATHER_SM5: - case PIPE_CAP_TEXTURE_QUERY_LOD: case PIPE_CAP_SAMPLE_SHADING: case PIPE_CAP_TEXTURE_GATHER_OFFSETS: case PIPE_CAP_DRAW_INDIRECT: diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index db928f3..499e511 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -5106,13 +5106,21 @@ static int tgsi_tex(struct r600_shader_ctx *ctx) tex.dst_sel_x = (inst->Dst[0].Register.WriteMask & 2) ? 1 : 7; tex.dst_sel_y = (inst->Dst[0].Register.WriteMask & 4) ? 2 : 7; tex.dst_sel_z = (inst->Dst[0].Register.WriteMask & 1) ? 0 : 7; + tex.dst_sel_w = (inst->Dst[0].Register.WriteMask & 8) ? 3 : 7; + } + else if (inst->Instruction.Opcode == TGSI_OPCODE_LODQ) { + tex.dst_sel_x = (inst->Dst[0].Register.WriteMask & 2) ? 1 : 7; + tex.dst_sel_y = (inst->Dst[0].Register.WriteMask & 1) ? 0 : 7; + tex.dst_sel_z = 7; + tex.dst_sel_w = 7; } else { tex.dst_sel_x = (inst->Dst[0].Register.WriteMask & 1) ? 0 : 7; tex.dst_sel_y = (inst->Dst[0].Register.WriteMask & 2) ? 1 : 7; tex.dst_sel_z = (inst->Dst[0].Register.WriteMask & 4) ? 2 : 7; + tex.dst_sel_w = (inst->Dst[0].Register.WriteMask & 8) ? 3 : 7; } - tex.dst_sel_w = (inst->Dst[0].Register.WriteMask & 8) ? 
3 : 7; + if (inst->Instruction.Opcode == TGSI_OPCODE_TXQ_LZ) { tex.src_sel_x = 4; @@ -6669,6 +6677,7 @@ static struct r600_shader_tgsi_instruction r600_shader_tgsi_instruction[] = { {TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_unsupported}, + {TGSI_OPCODE_LODQ, 0, FETCH_OP_GET_LOD, tgsi_unsupported}, {TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported}, }; @@ -6864,6 +6873,7 @@ static struct r600_shader_tgsi_instruction eg_shader_tgsi_instruction[] = { {TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_tex}, + {TGSI_OPCODE_LODQ, 0, FETCH_OP_GET_LOD, tgsi_tex}, {TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported}, }; @@ -7060,5 +7070,6 @@ static struct r600_shader_tgsi_instruction cm_shader_tgsi_instruction[] = { {TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, {TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_tex}, + {TGSI_OPCODE_LODQ, 0, FETCH_OP_GET_LOD, tgsi_tex}, {TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported}, }; -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-de
[Mesa-dev] [PATCH] r600g: Add IMUL_HI/UMUL_HI support
Fixes fs-imulExtended, fs-imulExtended-only-msb, fs-umulExtended,
fs-umulExtended-only-msb piglit tests.
---
Tested on radeon 6670

 src/gallium/drivers/r600/r600_shader.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index db928f3..6ba9c0f 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -,8 +,8 @@ static struct r600_shader_tgsi_instruction r600_shader_tgsi_instruction[] = {
 	{TGSI_OPCODE_TEX2, 0, FETCH_OP_SAMPLE, tgsi_tex},
 	{TGSI_OPCODE_TXB2, 0, FETCH_OP_SAMPLE_LB, tgsi_tex},
 	{TGSI_OPCODE_TXL2, 0, FETCH_OP_SAMPLE_L, tgsi_tex},
-	{TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported},
-	{TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported},
+	{TGSI_OPCODE_IMUL_HI, 0, ALU_OP2_MULHI_INT, tgsi_op2_trans},
+	{TGSI_OPCODE_UMUL_HI, 0, ALU_OP2_MULHI_UINT, tgsi_op2_trans},
 	{TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_unsupported},
 	{TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported},
 };
@@ -6861,8 +6861,8 @@ static struct r600_shader_tgsi_instruction eg_shader_tgsi_instruction[] = {
 	{TGSI_OPCODE_TEX2, 0, FETCH_OP_SAMPLE, tgsi_tex},
 	{TGSI_OPCODE_TXB2, 0, FETCH_OP_SAMPLE_LB, tgsi_tex},
 	{TGSI_OPCODE_TXL2, 0, FETCH_OP_SAMPLE_L, tgsi_tex},
-	{TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported},
-	{TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported},
+	{TGSI_OPCODE_IMUL_HI, 0, ALU_OP2_MULHI_INT, tgsi_op2_trans},
+	{TGSI_OPCODE_UMUL_HI, 0, ALU_OP2_MULHI_UINT, tgsi_op2_trans},
 	{TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_tex},
 	{TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported},
 };
@@ -7057,8 +7057,8 @@ static struct r600_shader_tgsi_instruction cm_shader_tgsi_instruction[] = {
 	{TGSI_OPCODE_TEX2, 0, FETCH_OP_SAMPLE, tgsi_tex},
 	{TGSI_OPCODE_TXB2, 0, FETCH_OP_SAMPLE_LB, tgsi_tex},
 	{TGSI_OPCODE_TXL2, 0, FETCH_OP_SAMPLE_L, tgsi_tex},
-	{TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported},
-	{TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported},
+	{TGSI_OPCODE_IMUL_HI, 0, ALU_OP2_MULHI_INT, cayman_mul_int_instr},
+	{TGSI_OPCODE_UMUL_HI, 0, ALU_OP2_MULHI_UINT, cayman_mul_int_instr},
 	{TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_tex},
 	{TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported},
 };
-- 
1.8.3.2
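For readers unfamiliar with the two opcodes wired up above: IMUL_HI and UMUL_HI yield the upper 32 bits of the 64-bit product of two 32-bit integers, which is what the imulExtended/umulExtended piglit tests exercise. The following is only a reference sketch of those semantics in plain C, not the r600 code path; the helper names are made up for illustration, and the signed variant assumes the compiler implements >> on a negative int64_t as an arithmetic shift.

```c
#include <stdint.h>

/* Hypothetical reference helpers, not part of Mesa. */

/* Upper 32 bits of the unsigned 64-bit product a * b. */
static uint32_t umul_hi(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* Upper 32 bits of the signed 64-bit product a * b.
 * Assumption: >> on a negative int64_t is an arithmetic shift,
 * which holds for the mainstream compilers but is not guaranteed by C. */
static int32_t imul_hi(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}
```

Such helpers could serve as a CPU-side oracle when comparing shader results for individual operand pairs.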
[Mesa-dev] [PATCH] r600g: Implement gpu_shader5 integer ops
--- Together with separate MUL_HI/UMUL_HI patch this passes piglit ARB_gpu_shader5 integer tests. This patch trivially depends on r600g-Implement-GL_ARB_texture_query_lod for the TGSI_OPCODE_LODQ table entries. docs/GL3.txt | 2 +- src/gallium/drivers/r600/r600_shader.c | 190 + 2 files changed, 191 insertions(+), 1 deletion(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index d481148..603413f 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -105,7 +105,7 @@ GL 4.0: - Dynamically uniform UBO array indices started (Chris) - Implicit signed -> unsigned conversionsDONE - Fused multiply-add DONE (i965, nvc0) - - Packing/bitfield/conversion functions DONE (i965, nvc0) + - Packing/bitfield/conversion functions DONE (i965, nvc0, r600) - Enhanced textureGather DONE (i965, nvc0, radeonsi) - Geometry shader instancing DONE (i965, nvc0) - Geometry shader multiple streams DONE (i965, nvc0) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 499e511..9abfee1 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -4192,6 +4192,172 @@ static int tgsi_ssg(struct r600_shader_ctx *ctx) return 0; } +static int tgsi_bfi(struct r600_shader_ctx *ctx) +{ + struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction; + struct r600_bytecode_alu alu; + int i, r, t1, t2; + + unsigned write_mask = inst->Dst[0].Register.WriteMask; + int last_inst = tgsi_last_instruction(write_mask); + + t1 = ctx->temp_reg; + + for (i = 0; i < 4; i++) { + if (!(write_mask & (1src[2], i); + + r = r600_bytecode_add_alu(ctx->bc, &alu); + if (r) + return r; + } + + t2 = r600_get_temp(ctx); + + for (i = 0; i < 4; i++) { + if (!(write_mask & (1 src[2], i); + + r = r600_bytecode_add_alu(ctx->bc, &alu); + if (r) + return r; + } + + for (i = 0; i < 4; i++) { + if (!(write_mask & (1 src[0], i); + + r = r600_bytecode_add_alu(ctx->bc, &alu); + if (r) + return r; + } + + return 0; +} + +static int tgsi_msb(struct 
r600_shader_ctx *ctx) +{ + struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction; + struct r600_bytecode_alu alu; + int i, r, t1, t2; + + unsigned write_mask = inst->Dst[0].Register.WriteMask; + int last_inst = tgsi_last_instruction(write_mask); + + assert(ctx->inst_info->op == ALU_OP1_FFBH_INT || + ctx->inst_info->op == ALU_OP1_FFBH_UINT); + + t1 = ctx->temp_reg; + + /* bit position is indexed from lsb by TGSI, and from msb by the hardware */ + for (i = 0; i < 4; i++) { + if (!(write_mask & (1 op; + alu.dst.sel = t1; + alu.dst.chan = i; + alu.dst.write = 1; + alu.last = i == last_inst; + + r600_bytecode_src(&alu.src[0], &ctx->src[0], i); + + r = r600_bytecode_add_alu(ctx->bc, &alu); + if (r) + return r; + } + + t2 = r600_get_temp(ctx); + + for (i = 0; i < 4; i++) { + if (!(write_mask & (1 Dst[0], i, &alu.dst); + alu.dst.chan = i; + alu.dst.write = 1; + alu.last = i == last_inst; + + alu.src[0].sel = t1; + alu.src[0].chan = i; + alu.src[1].sel = t2; + alu.src[1].chan = i; + alu.src[2].sel = t1; + alu.src[2].chan = i; + + r = r600_bytecode_add_alu(ctx->bc, &alu); + if (r) + return r; + } + + return 0; +} + static int tgsi_helper_copy(struct r600_shader_ctx *ctx, struct tgsi_full_instructio
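The comment in tgsi_msb ("bit position is indexed from lsb by TGSI, and from msb by the hardware") can be illustrated in plain C. This is a sketch under stated assumptions, not the patch's code: the helper names are hypothetical, ffbh_uint only models a find-first-bit-high operation that counts from the MSB and returns -1 for zero, and the 31 - index conversion is inferred from that comment because the corresponding lines of the patch are garbled in this archive.

```c
#include <stdint.h>

/* Hypothetical model of a find-first-bit-high operation: index of the
 * highest set bit counted from the MSB (0 means bit 31), or -1 if no
 * bit is set. */
static int ffbh_uint(uint32_t x)
{
    int i;

    for (i = 0; i < 32; i++) {
        if (x & (0x80000000u >> i))
            return i;
    }
    return -1;
}

/* Convert the MSB-based index to an LSB-based one; the "no bit found"
 * result (-1) is passed through unchanged. */
static int umsb(uint32_t x)
{
    int from_msb = ffbh_uint(x);

    return from_msb >= 0 ? 31 - from_msb : from_msb;
}
```

The final select mirrors the visible operand setup of the last ALU instruction in tgsi_msb, which picks between the converted value and the raw result depending on the sign of the raw result.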
Re: [Mesa-dev] [PATCH 3/5] gallium: add new semantics for tessellation
On Sat, Jul 19, 2014 at 4:59 PM, Ilia Mirkin wrote:
> +TGSI_SEMANTIC_TESSCOORD
> +"""""""""""""""""""""""
> +
> +For tessellation evaluation shaders, this semantic label indicates the
> +coordinates of the vertex being processed. This is available in XYZ; W is
> +undefined.
> +

+ "This corresponds to gl_TessCoord" ?

> +TGSI_SEMANTIC_TESSOUTER
> +"""""""""""""""""""""""
> +
> +For tessellation evaluation/control shaders, this semantic label indicates the
> +outer tessellation levels of the patch. Isoline tessellation will only have XY
> +defined, triangle will have XYZ and quads will have XYZW defined. This
> +corresponds to gl_TessLevelOuter.
> +
> +TGSI_SEMANTIC_TESSINNER
> +"""""""""""""""""""""""
> +
> +For tessellation evaluation/control shaders, this semantic label indicates the
> +inner tessellation levels of the patch. The X value is only defined for
> +triangle tessellation, while quads will have XY defined. This is entirely
> +undefined for isoline tessellation.

+ "This corresponds to gl_TessLevelInner" ?
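The component rules quoted above can be summarised as a small lookup. This is only an illustrative sketch of the documentation text; the enum and function names are invented here and do not exist in gallium.

```c
/* Hypothetical names, used only to restate the quoted documentation. */
enum tess_prim {
    TESS_PRIM_ISOLINES,
    TESS_PRIM_TRIANGLES,
    TESS_PRIM_QUADS
};

/* Number of defined components of TGSI_SEMANTIC_TESSOUTER:
 * isolines XY, triangles XYZ, quads XYZW. */
static int tess_outer_components(enum tess_prim prim)
{
    switch (prim) {
    case TESS_PRIM_ISOLINES:  return 2;
    case TESS_PRIM_TRIANGLES: return 3;
    case TESS_PRIM_QUADS:     return 4;
    }
    return -1; /* unknown primitive mode */
}

/* Number of defined components of TGSI_SEMANTIC_TESSINNER:
 * undefined for isolines, X for triangles, XY for quads. */
static int tess_inner_components(enum tess_prim prim)
{
    switch (prim) {
    case TESS_PRIM_ISOLINES:  return 0;
    case TESS_PRIM_TRIANGLES: return 1;
    case TESS_PRIM_QUADS:     return 2;
    }
    return -1; /* unknown primitive mode */
}
```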
[Mesa-dev] [Bug 78773] Doom3 BFG doesnt start
https://bugs.freedesktop.org/show_bug.cgi?id=78773

--- Comment #11 from Erik Faye-Lund ---
That implicit cast was fixed in RBDOOM-3-BFG Git 79b8e04e.

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [PATCH] r600g: gpu_shader5 gl_SampleMaskIn support
Map TGSI_SEMANTIC_SAMPLEMASK to register/component. Enable face register when sample mask is needed by shader. Requires Evergreen/Cayman --- I think the rest of the sample related bits in gpu_shader5 are from GL_ARB_sample_shading which isn't implemented yet in r600. Passes samplemaskin-basic piglit, no regressions, on radeon 6670 docs/GL3.txt | 2 +- src/gallium/drivers/r600/evergreen_state.c | 10 ++-- src/gallium/drivers/r600/r600_shader.c | 37 ++ 3 files changed, 41 insertions(+), 8 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 8128692..53e19e0 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -109,7 +109,7 @@ GL 4.0: - Enhanced textureGather DONE (i965, nvc0, radeonsi) - Geometry shader instancing DONE (i965, nvc0) - Geometry shader multiple streams DONE (i965, nvc0) - - Enhanced per-sample shadingDONE (i965) + - Enhanced per-sample shadingDONE (i965, r600) - Interpolation functionsDONE (i965) - New overload resolution rules DONE GL_ARB_gpu_shader_fp64 started (Dave) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 8f5ba5f..839d2ae 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -2843,8 +2843,14 @@ void evergreen_update_ps_state(struct pipe_context *ctx, struct r600_pipe_shader POSITION goes via GPRs from the SC so isn't counted */ if (rshader->input[i].name == TGSI_SEMANTIC_POSITION) pos_index = i; - else if (rshader->input[i].name == TGSI_SEMANTIC_FACE) - face_index = i; + else if (rshader->input[i].name == TGSI_SEMANTIC_FACE) { + if (face_index == -1) + face_index = i; + } + else if (rshader->input[i].name == TGSI_SEMANTIC_SAMPLEMASK) { + if (face_index == -1) + face_index = i; /* lives in same register, same enable bit */ + } else { ninterp++; if (rshader->input[i].interpolate == TGSI_INTERPOLATE_LINEAR) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index db928f3..c8ab4dd 100644 --- 
a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -287,7 +287,9 @@ struct r600_shader_ctx { boolean input_linear; boolean input_perspective; int num_interp_gpr; + /* evergreen/cayman also store sample mask in face register */ int face_gpr; + boolean has_samplemask; int colors_used; boolean clip_vertex_write; unsignedcv_output; @@ -498,7 +500,8 @@ static int r600_spi_sid(struct r600_shader_io * io) if (name == TGSI_SEMANTIC_POSITION || name == TGSI_SEMANTIC_PSIZE || name == TGSI_SEMANTIC_EDGEFLAG || - name == TGSI_SEMANTIC_FACE) + name == TGSI_SEMANTIC_FACE || + name == TGSI_SEMANTIC_SAMPLEMASK) index = 0; else { if (name == TGSI_SEMANTIC_GENERIC) { @@ -585,7 +588,8 @@ static int tgsi_declaration(struct r600_shader_ctx *ctx) ctx->shader->input[i].spi_sid = r600_spi_sid(&ctx->shader->input[i]); switch (ctx->shader->input[i].name) { case TGSI_SEMANTIC_FACE: - ctx->face_gpr = ctx->shader->input[i].gpr; + if (ctx->face_gpr == -1) + ctx->face_gpr = ctx->shader->input[i].gpr; break; case TGSI_SEMANTIC_COLOR: ctx->colors_used++; @@ -675,7 +679,14 @@ static int tgsi_declaration(struct r600_shader_ctx *ctx) break; case TGSI_FILE_SYSTEM_VALUE: - if (d->Semantic.Name == TGSI_SEMANTIC_INSTANCEID) { + if (d->Semantic.Name == TGSI_SEMANTIC_SAMPLEMASK) { + ctx->has_samplemask = true; + /* lives in Front Face GPR */ + if (ctx->face_gpr == -1) + ctx->face_gpr = ctx->file_offset[TGSI_FILE_SYSTEM_VALUE] + d->Range.First; + break; + } + else if (d->Semantic.Name == TGSI_SEMAN
[Mesa-dev] [PATCH v2] mesa: Fix glDrawBuffer/glDrawBuffers logic in _mesa_drawbuffer.
Piglit test 'gl30basic' fails on Debug Mesa with the assert:
'main/buffers.c:520: _mesa_drawbuffers: Assertion
`__builtin_popcount(destMask[buf]) == 1' failed.'

According to the spec (OpenGL 4.0 specification, pages 254-255), the bits
that may be set differ between a single buffer and multiple buffers.
For glDrawBuffer we may have up to four bits set, but for glDrawBuffers
each entry can only have one bit set.

_mesa_drawbuffers is called with ctx->Const.MaxDrawBuffers and NULL
arguments when _mesa_update_framebuffer or _mesa_update_draw_buffers is
called. In this case the glDrawBuffers rule is always applied if
MaxDrawBuffers > 1. But the glDrawBuffer rule has to be applied instead
if only destMask[0] is set.

v2 (Brian Paul): Only 0th entry requires special validation for (m == 1).

Signed-off-by: Pavel Popov
---
 src/mesa/main/buffers.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/buffers.c b/src/mesa/main/buffers.c
index b13a7af..95a8722 100644
--- a/src/mesa/main/buffers.c
+++ b/src/mesa/main/buffers.c
@@ -480,6 +480,7 @@ _mesa_drawbuffers(struct gl_context *ctx, GLuint n, const GLenum *buffers,
    struct gl_framebuffer *fb = ctx->DrawBuffer;
    GLbitfield mask[MAX_DRAW_BUFFERS];
    GLuint buf;
+   GLuint m = n;
 
    if (!destMask) {
       /* compute destMask values now */
@@ -489,15 +490,17 @@ _mesa_drawbuffers(struct gl_context *ctx, GLuint n, const GLenum *buffers,
          mask[output] = draw_buffer_enum_to_bitmask(ctx, buffers[output]);
          ASSERT(mask[output] != BAD_MASK);
          mask[output] &= supportedMask;
+         if (mask[output] == 0)
+            m--;
       }
       destMask = mask;
    }
 
    /*
-    * If n==1, destMask[0] may have up to four bits set.
+    * If m==1, destMask[0] may have up to four bits set.
     * Otherwise, destMask[x] can only have one bit set.
     */
-   if (n == 1) {
+   if (m == 1 && destMask[0]) {
       GLuint count = 0, destMask0 = destMask[0];
       while (destMask0) {
          GLint bufIndex = ffs(destMask0) - 1;
-- 
1.9.1
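The decision made by the patched condition can be sketched in isolation. This is a simplified illustration, not the Mesa code: the function names are hypothetical, plain uint32_t stands in for GLbitfield, and the sketch always counts non-empty entries, whereas the patch only adjusts m when it computes the masks itself.

```c
#include <stdint.h>

/* Hypothetical helper: how many of the first n entries are non-zero. */
static unsigned count_nonzero_masks(const uint32_t *dest_mask, unsigned n)
{
    unsigned i, m = 0;

    for (i = 0; i < n; i++) {
        if (dest_mask[i] != 0)
            m++;
    }
    return m;
}

/* Returns 1 when the glDrawBuffer-style rule applies, i.e. entry 0 is
 * the only non-empty entry and may therefore carry several bits.
 * Returns 0 when every entry must carry at most one bit. */
static int single_buffer_rule_applies(const uint32_t *dest_mask, unsigned n)
{
    return count_nonzero_masks(dest_mask, n) == 1 && dest_mask[0] != 0;
}
```

With n equal to MaxDrawBuffers and only the first mask populated, the helper reports the single-buffer case, which is the situation the commit message describes.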
[Mesa-dev] [Bug 78773] Doom3 BFG doesnt start
https://bugs.freedesktop.org/show_bug.cgi?id=78773

--- Comment #12 from Tapani Pälli ---
(In reply to comment #11)
> That implicit cast was fixed in RBDOOM-3-BFG Git 79b8e04e.

Cool, now it seems to initialize all right on IVB. It still fails for me
during loading, but I only have the original pk4 files to test with, not
the BFG edition ones. Could someone with the BFG files please resolve
this?
Re: [Mesa-dev] [PATCH] util: Fix fallback iround handling of integral inputs
Roland Scheidegger writes:
> On 22.07.2014 12:26, Richard Sandiford wrote:
>> Roland Scheidegger writes:
>>> On 21.07.2014 17:53, Richard Sandiford wrote:
>>>> lp_build_iround has a fallback case that tries to emulate a
>>>> round-to-nearest float->int conversion by adding 0.5 and using
>>>> a truncating fptosi. For odd numbers in the range [2^23, 2^24],
>>>> which are already integral, this has the effect of adding 1 to the
>>>> integer, since the result of adding 0.5 is exactly half way between
>>>> two representable values and the tie goes to even.
>>>>
>>>> This includes the important special case of (float)0xffffff, which is
>>>> the maximum depth in a z24s8 format. I.e. calling iround on
>>>> (float)0xffffff gives 0x1000000 rather than 0xffffff when the fallback
>>>> is used.
>>>>
>>>> This patch only uses the x+0.5 trick if the ulp of the input is
>>>> fractional. This still doesn't give ties-to-even semantics, but
>>>> that doesn't seem to matter much in practice.
>>>>
>>>> Signed-off-by: Richard Sandiford
>>>> ---
>>>>  src/gallium/auxiliary/gallivm/lp_bld_arit.c | 43 ++---
>>>>  1 file changed, 27 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>>> index 9ef8628..82ddb5a 100644
>>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>>> @@ -2181,25 +2181,36 @@ lp_build_iround(struct lp_build_context *bld,
>>>>        res = lp_build_round_arch(bld, a, LP_BUILD_ROUND_NEAREST);
>>>>     }
>>>>     else {
>>>> -      LLVMValueRef half;
>>>> +      struct lp_type int_type = lp_int_type(type);
>>>> +      LLVMTypeRef vec_type = bld->vec_type;
>>>> +      unsigned mantissa = lp_mantissa(type);
>>>> +      unsigned expbits = type.width - mantissa - 1;
>>>> +      unsigned bias = (1 << (expbits - 1)) - 1;
>>>> +      LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>>>> +                               (unsigned long long)1 << (type.width - 1));
>>>> +      /* Smallest value with an integral ulp */
>>>> +      LLVMValueRef abslimit = lp_build_const_int_vec(bld->gallivm, type,
>>>> +                               (bias + mantissa) << mantissa);
>>>> +      LLVMValueRef half, sign, absa, fractulp;
>>>>
>>>>        half = lp_build_const_vec(bld->gallivm, type, 0.5);
>>>>
>>>> -      if (type.sign) {
>>>> -         LLVMTypeRef vec_type = bld->vec_type;
>>>> -         LLVMValueRef mask = lp_build_const_int_vec(bld->gallivm, type,
>>>> -                                    (unsigned long long)1 << (type.width - 1));
>>>> -         LLVMValueRef sign;
>>>> -
>>>> -         /* get sign bit */
>>>> -         sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>>>> -         sign = LLVMBuildAnd(builder, sign, mask, "");
>>>> -
>>>> -         /* sign * 0.5 */
>>>> -         half = LLVMBuildBitCast(builder, half, int_vec_type, "");
>>>> -         half = LLVMBuildOr(builder, sign, half, "");
>>>> -         half = LLVMBuildBitCast(builder, half, vec_type, "");
>>>> -      }
>>>> +      assert(type.sign);
>>> You can't assert here. It is quite legal to have floats without sign
>>> type - used to skip things like here when we know the input is positive.
>>> There might not be all that many callers using such build contexts,
>>> though some quick glance identified at least two
>>> (lp_build_coord_repeat_npot_linear_int,
>>> lp_build_clamped_float_to_unsigned_norm). Though I would have thought
>>> the latter is where the bug you're seeing is coming from...
>>
>> Ah, sorry, I hadn't realised that sort of modification could happen.
>> I just looked at the types that lp_bld_type.h created.
>>
>>> Also, I'm not a fan of making this more complex in general.
>>> lp_build_itrunc is most often used for converting texture coords, or
>>
>> (lp_build_iround rather than lp_build_itrunc?)
> Yes.
>
>>
>>> other stuff like srgb conversion, and none of these callers care about
>>> this, so maybe should distinguish the cases.
>>
>> As an extra boolean argument? Yeah, I can try that, although I'll probably
>> need help deciding what the right argument is for some cases. Or were you
>> thinking of something else?
> Yes that's what I was thinking. Or otherwise use a different function.
> Both solutions are kind of ugly though admittedly.

Yeah, would prefer to try the other alternative first.

>>>> +
>>>> +      /* get sign bit */
>>>> +      sign = LLVMBuildBitCast(builder, a, int_vec_type, "");
>>>> +      sign = LLVMBuildAnd(builder, sign, mask, "");
>>>> +
>>>> +      /* get "ulp is fractional" */
>>>> +      absa = lp_build_abs(bld, a);
>>>> +      absa = LLVMBuildBitCast(builder, absa, int_vec_type, "");
>>>> +      fractulp = lp_build_compare(bld->gallivm, int_type, PIPE_FUNC_LESS, absa, abslimit);
>>>> +
>>>> +      /* (sign * 0.5) & intulp */
>>>> +      half = LLVMBuildBitCast(builder, half, int_vec_type, "");
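The rounding problem discussed in this thread can be reproduced with scalar floats. The sketch below is not the gallivm code; it is a plain C restatement under assumptions: the function names are invented, the input is a 32-bit IEEE float with the default round-to-nearest-even mode, and the guard uses 2^23 as the smallest magnitude whose ulp is integral (the value the patch encodes as abslimit).

```c
#include <stdint.h>

/* Naive fallback: add 0.5 with the sign of x, then truncate.
 * The volatile store forces the sum to be rounded to float precision
 * even on targets that evaluate expressions in a wider format. */
static int32_t iround_naive(float x)
{
    float half = x < 0.0f ? -0.5f : 0.5f;
    volatile float biased = x + half;

    return (int32_t)biased;
}

/* Guarded variant: only add 0.5 while the ulp of x is fractional,
 * i.e. while |x| < 2^23. Larger values are already integral.
 * Like the patch, this rounds ties away from zero, not to even. */
static int32_t iround_guarded(float x)
{
    const float abslimit = 8388608.0f; /* 2^23 */
    float absx = x < 0.0f ? -x : x;

    if (absx < abslimit)
        return iround_naive(x);
    return (int32_t)x;
}
```

For the z24 maximum, 16777215.0f + 0.5f lies exactly between 16777215 and 16777216, and the tie resolves to the even neighbour, so the naive version returns 16777216 while the guarded version keeps 16777215.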
[Mesa-dev] [PATCH] r600g: Implement BPTC texture support
Signed-off-by: Glenn Kennard
---
This patch depends on Ilia Mirkin's "nvc0: add BPTC format support"
and Neil Roberts's core BPTC support patches.

 src/gallium/drivers/r600/r600_state_common.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c
index 8c37d0d..2f39df3 100644
--- a/src/gallium/drivers/r600/r600_state_common.c
+++ b/src/gallium/drivers/r600/r600_state_common.c
@@ -1967,6 +1967,29 @@ uint32_t r600_translate_texformat(struct pipe_screen *screen,
 		}
 	}
 
+	if (desc->layout == UTIL_FORMAT_LAYOUT_BPTC) {
+		if (!enable_s3tc)
+			goto out_unknown;
+
+		if (rscreen->b.chip_class < EVERGREEN)
+			goto out_unknown;
+
+		switch (format) {
+		case PIPE_FORMAT_BPTC_RGBA_UNORM:
+		case PIPE_FORMAT_BPTC_SRGBA_UNORM:
+			result = FMT_BC7;
+			is_srgb_valid = TRUE;
+			goto out_word4;
+		case PIPE_FORMAT_BPTC_RGB_FLOAT:
+		case PIPE_FORMAT_BPTC_RGB_UFLOAT:
+			result = FMT_BC6;
+			is_srgb_valid = TRUE;
+			goto out_word4;
+		default:
+			goto out_unknown;
+		}
+	}
+
 	if (desc->layout == UTIL_FORMAT_LAYOUT_SUBSAMPLED) {
 		switch (format) {
 		case PIPE_FORMAT_R8G8_B8G8_UNORM:
-- 
1.8.3.2
[Mesa-dev] [Bug 81500] wrong color in flightgear for the c172p if "Atmospheric light scattering" is used
https://bugs.freedesktop.org/show_bug.cgi?id=81500

Benjamin Bellec changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |b.bel...@gmail.com

--- Comment #27 from Benjamin Bellec ---
@Barto,
When you launch FlightGear in a terminal, do you get any log output when
enabling the "Atmospheric light scattering" option?

Personally, I launch FlightGear with this command:
$ /usr/bin/fgfs --fg-root=/usr/share/flightgear
--fg-scenery=/usr/share/flightgear/Scenery --airport=KPAO
--aircraft=c172p-canvas --enable-random-objects --enable-horizon-effect
--enable-enhanced-lighting --enable-ai-models --enable-ai-traffic
--disable-real-weather-fetch --enable-clouds3d --geometry=1024x768

And when I enable the problematic option, I get this log from FlightGear:
FRAGMENT glCompileShader "/usr/share/flightgear/Shaders/urban-lightfield.frag" FAILED
FRAGMENT Shader "/usr/share/flightgear/Shaders/urban-lightfield.frag" infolog:
0:196(22): preprocessor error: syntax error, unexpected IDENTIFIER, expecting NEWLINE
0:193(1): preprocessor error: Unterminated #if

glLinkProgram "" FAILED
Program "" infolog:
error: linking with uncompiled shader

Do you get some kind of message too?

Note that I'm using FlightGear 2.10 (from the Fedora 19 repo) and an
Evergreen card with Mesa 10.3-devel.
Re: [Mesa-dev] [PATCH 4/5] r600g, radeonsi: Use write-combined persistent GTT mappings
This sounds good to me.

Marek

On Wed, Jul 23, 2014 at 8:32 AM, Michel Dänzer wrote:
> On 18.07.2014 20:45, Marek Olšák wrote:
>> If the requirements of GL_MAP_COHERENT_BIT are satisfied, then the
>> patch is okay.
>
> AFAICT GL_MAP_COHERENT_BIT simply means the app doesn't need to call
> glMemoryBarrier() to ensure coherency between access to the mapping and
> GL commands. Since we don't need to do anything for glMemoryBarrier(),
> GL_MAP_COHERENT_BIT doesn't make any difference for us.
>
> That said, I think I need to add code in the kernel ensuring we always
> flush the HDP cache before submitting a command stream to the hardware,
> bump the minor version, and only use VRAM for streaming when the DRM
> minor version is >= the bumped version.
>
>
> --
> Earthling Michel Dänzer            |                  http://www.amd.com
> Libre software enthusiast          |                Mesa and X developer
Re: [Mesa-dev] [PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT
On Wed, Jul 23, 2014 at 9:21 AM, Michel Dänzer wrote: > On 23.07.2014 15:42, Christian König wrote: >> Am 23.07.2014 05:54, schrieb Michel Dänzer: >>> On 21.07.2014 17:07, Christian König wrote: Am 19.07.2014 03:15, schrieb Michel Dänzer: > On 19.07.2014 00:47, Christian König wrote: >> Am 18.07.2014 05:07, schrieb Michel Dänzer: > [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI I'm still not very keen with this change since I still don't understand the reason why it's faster than with GTT. Definitely needs more testing on a wider range of systems. >>> Sure. If anyone wants to give this patch a spin and see if they can >>> measure any performance difference, good or bad, that would be >>> interesting. >>> Maybe limit it to APUs for now? >>> But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an >>> even >>> bigger win with dedicated GPUs than with the Kaveri built-in GPU >>> on my >>> system. I suspect it may depend on the bandwidth available for >>> PCIe vs. >>> system memory though. >> I've made a few tests today with the kernel part of the patches >> running >> Xonotic on Ultra in 1920 x 1080. >> >> Without any patches I get around ~47.0fps on average with my dedicated >> HD7870. >> >> Adding only "drm/radeon: Use write-combined CPU mappings of rings and >> IBs on >= SI" and that goes down to ~45.3fps. >> >> Adding on to off that "drm/radeon: Use VRAM for indirect buffers on >= >> SI" and the frame rate goes down to ~27.74fps. > Hmm, looks like I'll need to do more benchmarking of 3D workloads as > well. >>> I haven't been able to consistently[0] measure any significant >>> difference between all placements of the rings and IBs with Xonotic or >>> Reaction Quake with my Bonaire. I'd expect Xonotic to be shader / GPU >>> memory bandwidth bound rather than CS bound anyway, so a ~40% hit from >>> that kernel patch alone is very surprising. Are you sure it wasn't just >>> the same kind of variation as described below? 
>>
>> Yes, I've measured that multiple times and the results were quite
>> consistent.
>>
>> But I didn't measure it on a Bonaire, where the bottleneck probably
>> isn't the CPU load. I measured it on a fast Pitcairn
>
> Ahem, my Bonaire is cranking out ~90fps of Xonotic Ultra at 1920x1080.
> :) (And AFAIK there are even faster Bonaire variants)
>
>> and there Xonotic was clearly affected by the patches.
>
> Okay, I hadn't realized we're not doing any command stream checking as
> of CIK, that probably explains the difference.

I think CIK is doing CS checking for VCE, but not for graphics. SI is
doing CS checking for everything.

Marek
[Mesa-dev] [Bug 81500] wrong color in flightgear for the c172p if "Atmospheric light scattering" is used
https://bugs.freedesktop.org/show_bug.cgi?id=81500

--- Comment #28 from Barto ---
(In reply to comment #27)
> @Barto,
> when I enable the problematic option, I got this log from FlightGear :
> FRAGMENT glCompileShader
> "/usr/share/flightgear/Shaders/urban-lightfield.frag" FAILED
> FRAGMENT Shader "/usr/share/flightgear/Shaders/urban-lightfield.frag"
> infolog:
> 0:196(22): preprocessor error: syntax error, unexpected IDENTIFIER,
> expecting NEWLINE
> 0:193(1): preprocessor error: Unterminated #if
>
>
> glLinkProgram "" FAILED
> Program "" infolog:
> error: linking with uncompiled shader
>
>
> Do you have some kind of message too ?
>
> Note that I'm using FlightGear 2.10 (from Fedora 19 repo) and an Evergreen
> card with Mesa 10.3-devel.

I don't get this error message, but I use FlightGear 3.0.0, not 2.10. You should use FlightGear 3.0.0, because a lot of FlightGear bugs were fixed between 2.10 and 3.0.0.

An alternative is to run the apitrace I have attached (just install the apitrace program and run "apitrace replay fgfs.trace"). Tell me if you see a green C172p (I use the "c172p" plane, not the "c172p-canvas" version).

My graphics card is an ATI Radeon HD4650 PCIe (RV730 chipset):

OpenGL renderer string: Gallium 0.4 on AMD RV730
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.2.4
OpenGL core profile shading language version string: 3.30

If you don't see a green C172p in the apitrace, then the bug may occur only with some Mesa drivers (r600, for example).
Re: [Mesa-dev] [PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT
On Tue, Jul 22, 2014 at 11:54 PM, Michel Dänzer wrote: > On 21.07.2014 17:07, Christian König wrote: >> Am 19.07.2014 03:15, schrieb Michel Dänzer: >>> On 19.07.2014 00:47, Christian König wrote: Am 18.07.2014 05:07, schrieb Michel Dänzer: >>> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI >> I'm still not very keen with this change since I still don't >> understand >> the reason why it's faster than with GTT. Definitely needs more >> testing >> on a wider range of systems. > Sure. If anyone wants to give this patch a spin and see if they can > measure any performance difference, good or bad, that would be > interesting. > >> Maybe limit it to APUs for now? > But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an > even > bigger win with dedicated GPUs than with the Kaveri built-in GPU on my > system. I suspect it may depend on the bandwidth available for PCIe vs. > system memory though. I've made a few tests today with the kernel part of the patches running Xonotic on Ultra in 1920 x 1080. Without any patches I get around ~47.0fps on average with my dedicated HD7870. Adding only "drm/radeon: Use write-combined CPU mappings of rings and IBs on >= SI" and that goes down to ~45.3fps. Adding on to off that "drm/radeon: Use VRAM for indirect buffers on >= SI" and the frame rate goes down to ~27.74fps. >>> Hmm, looks like I'll need to do more benchmarking of 3D workloads as >>> well. > > I haven't been able to consistently[0] measure any significant > difference between all placements of the rings and IBs with Xonotic or > Reaction Quake with my Bonaire. I'd expect Xonotic to be shader / GPU > memory bandwidth bound rather than CS bound anyway, so a ~40% hit from > that kernel patch alone is very surprising. Are you sure it wasn't just > the same kind of variation as described below? > > [0] There were slightly different results sometimes, but next time I > tried the same setup again, it was back to the same as always. 
So it
> seemed to depend more on the particular system boot / test run / moon
> phase / ... than the kernel patches themselves.
>
>
>>> Alex, given those numbers, it's probably best if you remove the "Use
>>> write-combined CPU mappings of rings and IBs on >= SI" change from your
>>> tree as well for now.
>>
>> I wouldn't go as far as reverting the patch. It just needs a bit more
>> fine tuning and that can happen in the 3.17rc cycle.
>
> There's no need to revert it, just drop it from the tree. I'd still
> prefer that for now.
>

I can drop it. Maybe just map the rings WC for now?

Alex

>
>> My tests clearly show that we still can use USWC for the ring buffer on
>> SI and probably earlier chips as well.
>
> Yeah, that might be the safest approach for now.
>
>
> --
> Earthling Michel Dänzer| http://www.amd.com
> Libre software enthusiast |Mesa and X developer
Re: [Mesa-dev] [PATCH] r600g: Use hardware sqrt instruction
On Fri, Jul 18, 2014 at 12:35:24PM -0400, Alex Deucher wrote: > On Fri, Jul 18, 2014 at 3:54 AM, Glenn Kennard > wrote: > > Piglit quick tests including sqrt pass, no other regressions, > > tested on radeon 6670. > > --- > > Should be slightly more precise than the invsqrt/recip/mul combination > > used previously, I reckon up to about 2 bits of mantissa, and saves > > two instructions per sqrt emitted. > > > > It would be good if someone could test this on Cayman since it uses > > a slightly different codepath. > > Reviewed-by: Alex Deucher > I've pushed this patch, thanks! -Tom > > > > src/gallium/drivers/r600/r600_pipe.c | 2 +- > > src/gallium/drivers/r600/r600_shader.c | 9 +++-- > > 2 files changed, 4 insertions(+), 7 deletions(-) > > > > diff --git a/src/gallium/drivers/r600/r600_pipe.c > > b/src/gallium/drivers/r600/r600_pipe.c > > index 5bf9c00..ee6a416 100644 > > --- a/src/gallium/drivers/r600/r600_pipe.c > > +++ b/src/gallium/drivers/r600/r600_pipe.c > > @@ -428,7 +428,7 @@ static int r600_get_shader_param(struct pipe_screen* > > pscreen, unsigned shader, e > > case PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED: > > return 1; > > case PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED: > > - return 0; > > + return 1; > > case PIPE_SHADER_CAP_INDIRECT_INPUT_ADDR: > > case PIPE_SHADER_CAP_INDIRECT_OUTPUT_ADDR: > > case PIPE_SHADER_CAP_INDIRECT_TEMP_ADDR: > > diff --git a/src/gallium/drivers/r600/r600_shader.c > > b/src/gallium/drivers/r600/r600_shader.c > > index db928f3..907547d 100644 > > --- a/src/gallium/drivers/r600/r600_shader.c > > +++ b/src/gallium/drivers/r600/r600_shader.c > > @@ -6498,8 +6498,7 @@ static struct r600_shader_tgsi_instruction > > r600_shader_tgsi_instruction[] = { > > {TGSI_OPCODE_SUB, 0, ALU_OP2_ADD, tgsi_op2}, > > {TGSI_OPCODE_LRP, 0, ALU_OP0_NOP, tgsi_lrp}, > > {TGSI_OPCODE_CND, 0, ALU_OP0_NOP, tgsi_unsupported}, > > - /* gap */ > > - {20,0, ALU_OP0_NOP, tgsi_unsupported}, > > + {TGSI_OPCODE_SQRT, 0, ALU_OP1_SQRT_IEEE, > > tgsi_trans_srcx_replicate}, > > 
{TGSI_OPCODE_DP2A, 0, ALU_OP0_NOP, tgsi_unsupported}, > > /* gap */ > > {22,0, ALU_OP0_NOP, tgsi_unsupported}, > > @@ -6693,8 +6692,7 @@ static struct r600_shader_tgsi_instruction > > eg_shader_tgsi_instruction[] = { > > {TGSI_OPCODE_SUB, 0, ALU_OP2_ADD, tgsi_op2}, > > {TGSI_OPCODE_LRP, 0, ALU_OP0_NOP, tgsi_lrp}, > > {TGSI_OPCODE_CND, 0, ALU_OP0_NOP, tgsi_unsupported}, > > - /* gap */ > > - {20,0, ALU_OP0_NOP, tgsi_unsupported}, > > + {TGSI_OPCODE_SQRT, 0, ALU_OP1_SQRT_IEEE, > > tgsi_trans_srcx_replicate}, > > {TGSI_OPCODE_DP2A, 0, ALU_OP0_NOP, tgsi_unsupported}, > > /* gap */ > > {22,0, ALU_OP0_NOP, tgsi_unsupported}, > > @@ -6888,8 +6886,7 @@ static struct r600_shader_tgsi_instruction > > cm_shader_tgsi_instruction[] = { > > {TGSI_OPCODE_SUB, 0, ALU_OP2_ADD, tgsi_op2}, > > {TGSI_OPCODE_LRP, 0, ALU_OP0_NOP, tgsi_lrp}, > > {TGSI_OPCODE_CND, 0, ALU_OP0_NOP, tgsi_unsupported}, > > - /* gap */ > > - {20,0, ALU_OP0_NOP, tgsi_unsupported}, > > + {TGSI_OPCODE_SQRT, 0, ALU_OP1_SQRT_IEEE, > > cayman_emit_float_instr}, > > {TGSI_OPCODE_DP2A, 0, ALU_OP0_NOP, tgsi_unsupported}, > > /* gap */ > > {22,0, ALU_OP0_NOP, tgsi_unsupported}, > > -- > > 1.8.3.2 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/5] [RFC] r600g/compute: Adding support for defragmenting compute_memory_pool
On Fri, Jul 18, 2014 at 01:09:03PM +0200, Bruno Jimenez wrote: > On Thu, 2014-07-17 at 22:56 -0400, Tom Stellard wrote: > > On Wed, Jul 16, 2014 at 11:12:42PM +0200, Bruno Jiménez wrote: > > > Hi, > > > > > > This series finally adds support for defragmenting the pool for > > > OpenCL buffers in the r600g driver. It is mostly a rewritten of > > > the series that I wrote some months ago. > > > > > > For defragmenting the pool I have thought of two different > > > possibilities: > > > > > > - Creating a new pool and moving every item here in the correct > > > position. This has the advantage of being very simple to > > > implement and that it allows the pool to be grown at the > > > same time. But it has a couple of problems, namely that it > > > has a high memory peak usage (sum of current pool + new pool) > > > and that in the case of having a pool not very fragmented you > > > have to copy every item to its new place. > > > - Using the same pool by moving the items in it. This has the > > > advantage of using less memory (sum of current pool + biggest > > > item in it) and that it is easier to handle the case of > > > only having few elements out of place. The disadvantages > > > are that it doesn't allow growing the pool at the same time > > > and that it may involve twice the number of item-copies in > > > the worst case. > > > > > > I have chosen to implement the second option, but if you think > > > that it is better the first one I can rewrite the series for it. > > > (^_^) > > > > > > The worst case I have mentioned is this: Imagine that you have > > > a series of items in which the first is, at least, 1 'unit' > > > smaller than the rest. You now free this item and create a new > > > one with the same size [why would anyone do this? I don't know] > > > For now, the defragmenter code is so dumb that it will move > > > every item to the front of the pool without trying first to > > > put this new item in the available space. 
> > > > > > Hopefully situations like this won't be very common. > > > > > > If you want me to explain any detail about any of the patches > > > just ask. And as said, if you prefer the first version of the > > > defragmenter, just ask. [In fact, after having written this, > > > I may add it for the case grow+defrag] > > > > > > Also, no regressions found in piglit. > > > > > > Thanks in advance! > > > Bruno > > > > > > Bruno Jiménez (5): > > > r600g/compute: Add a function for moving items in the pool > > > r600g/compute: Add a function for defragmenting the pool > > > r600g/compute: Defrag the pool if it's necesary > > > r600g/compute: Quick exit if there's nothing to add to the pool > > > r600g/compute: Remove unneeded code from compute_memory_promote_item > > > > > > src/gallium/drivers/r600/compute_memory_pool.c | 196 > > > ++--- > > > src/gallium/drivers/r600/compute_memory_pool.h | 13 +- > > > 2 files changed, 156 insertions(+), 53 deletions(-) > > > > Hi, > > > > A took a brief look at these patches and they look pretty good. I will > > look at them again tomorrow and then commit if I don't see any issues. > I've pushed these patches, thanks! -Tom > Hi, > > Thanks, if you have any doubt about any of the patches just ask. > > I have just ended writing a follow up series for doing grow + defrag at > the same time. I still have to test it, but if no problems arise I'll > send it to the list as soon as possible. > > This new series is based on the patch that I sent here: > http://lists.freedesktop.org/archives/mesa-dev/2014-July/062923.html > If you think it's good, could you push it to master? > > Thanks in advance! > Bruno > > > -Tom > > > > > > > > -- > > > 2.0.1 > > > > > > ___ > > > mesa-dev mailing list > > > mesa-dev@lists.freedesktop.org > > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
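For readers following the thread, the second option described above (compacting the items inside the same pool) can be sketched roughly as below. This is only an illustration under simplifying assumptions: the pool is a plain byte array, `struct item` and `defrag_in_place` are hypothetical names rather than the driver's `compute_memory_item` or `compute_memory_defrag`, items are already sorted by offset, and the ITEM_ALIGNMENT rounding the driver applies is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified item; not the driver's compute_memory_item. */
struct item {
    size_t start; /* offset of the item inside the pool, in bytes */
    size_t size;  /* size of the item, in bytes */
};

/* Compacts the items (sorted by ascending start) toward the front of the
 * pool and returns the first free offset after compaction. */
static size_t defrag_in_place(uint8_t *pool, struct item *items, size_t n)
{
    size_t last_pos = 0;

    for (size_t i = 0; i < n; i++) {
        if (items[i].start != last_pos) {
            /* Items only ever move toward the front of the pool. */
            assert(last_pos < items[i].start);
            /* memmove tolerates overlapping source and destination. */
            memmove(pool + last_pos, pool + items[i].start, items[i].size);
            items[i].start = last_pos;
        }
        last_pos += items[i].size;
    }
    return last_pos;
}
```

In this toy version the worst case from the cover letter shows up directly: if the first slot is freed, every later item gets moved, even when a new item of the same size would have fit into the hole.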
Re: [Mesa-dev] [PATCH 1/2] radeon/llvm: enable unsafe math for graphics shaders
On Tue, Jul 22, 2014 at 12:36:33AM +0200, Grigori Goronzy wrote: > On 17.07.2014 21:24, Tom Stellard wrote: > > On Thu, Jul 17, 2014 at 06:44:25PM +0200, Grigori Goronzy wrote: > >> Accuracy of some operations was recently improved in the R600 backend, > >> at the cost of slower code. This is required for compute shaders, > >> but not for graphics shaders. Add unsafe-fp-math hint to make LLVM > >> generate faster but possibly less accurate code. > >> > >> Piglit didn't indicate any regressions. > > > > Both patches are: > > Reviewed-by: Tom Stellard > > > > Can you please commit the patches for me? My account request is still > pending. > I just pushed these, thanks! -Tom > Grigori > > >> --- > >> src/gallium/drivers/radeon/radeon_llvm_emit.c | 5 + > >> 1 file changed, 5 insertions(+) > >> > >> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c > >> b/src/gallium/drivers/radeon/radeon_llvm_emit.c > >> index 1b17dd4..171ccaa 100644 > >> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c > >> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c > >> @@ -26,6 +26,7 @@ > >> #include "radeon_llvm_emit.h" > >> #include "radeon_elf_util.h" > >> #include "util/u_memory.h" > >> +#include "pipe/p_shader_tokens.h" > >> > >> #include > >> #include > >> @@ -50,6 +51,10 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned > >> type) > >>sprintf(Str, "%1d", type); > >> > >>LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str); > >> + > >> + if (type != TGSI_PROCESSOR_COMPUTE) { > >> +LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true"); > >> + } > >> } > >> > >> static void init_r600_target() { > >> -- > >> 1.8.3.2 > >> > >> ___ > >> mesa-dev mailing list > >> mesa-dev@lists.freedesktop.org > >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org 
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] r600g/compute: Allow compute_memory_move_item to move items between resources
On Sat, Jul 19, 2014 at 07:35:49PM +0200, Bruno Jiménez wrote: > --- > src/gallium/drivers/r600/compute_memory_pool.c | 43 > ++ > src/gallium/drivers/r600/compute_memory_pool.h | 1 + > 2 files changed, 25 insertions(+), 19 deletions(-) > > diff --git a/src/gallium/drivers/r600/compute_memory_pool.c > b/src/gallium/drivers/r600/compute_memory_pool.c > index 254c1d7..1ad77ad 100644 > --- a/src/gallium/drivers/r600/compute_memory_pool.c > +++ b/src/gallium/drivers/r600/compute_memory_pool.c > @@ -331,6 +331,7 @@ void compute_memory_defrag(struct compute_memory_pool > *pool, > struct pipe_context *pipe) > { > struct compute_memory_item *item; > + struct pipe_resource *src = (struct pipe_resource *)pool->bo; > int64_t last_pos; > > COMPUTE_DBG(pool->screen, "* compute_memory_defrag()\n"); > @@ -340,7 +341,8 @@ void compute_memory_defrag(struct compute_memory_pool > *pool, > if (item->start_in_dw != last_pos) { > assert(last_pos < item->start_in_dw); > > - compute_memory_move_item(pool, item, last_pos, pipe); > + compute_memory_move_item(pool, src, src, > + item, last_pos, pipe); > } > > last_pos += align(item->size_in_dw, ITEM_ALIGNMENT); > @@ -431,7 +433,8 @@ void compute_memory_demote_item(struct > compute_memory_pool *pool, > } > > /** > - * Moves the item \a item forward in the pool to \a new_start_in_dw > + * Moves the item \a item forward from the resource \a src to the > + * resource \a dst at \a new_start_in_dw > * > * This function assumes two things: > * 1) The item is \b only moved forward > @@ -442,13 +445,14 @@ void compute_memory_demote_item(struct > compute_memory_pool *pool, > * \see compute_memory_defrag > */ > void compute_memory_move_item(struct compute_memory_pool *pool, > + struct pipe_resource *src, struct pipe_resource *dst, > struct compute_memory_item *item, uint64_t new_start_in_dw, > struct pipe_context *pipe) > { > struct pipe_screen *screen = (struct pipe_screen *)pool->screen; > struct r600_context *rctx = (struct r600_context *)pipe; > - 
struct pipe_resource *src = (struct pipe_resource *)pool->bo; > - struct pipe_resource *dst; > + struct pipe_resource *src_ = src; > + struct pipe_resource *dst_;

I think it is confusing to have variables named src_ and src. Could you rename one of them to something more descriptive?

> struct pipe_box box; > > struct compute_memory_item *prev; > @@ -465,34 +469,35 @@ void compute_memory_move_item(struct > compute_memory_pool *pool, > > u_box_1d(item->start_in_dw * 4, item->size_in_dw * 4, &box); > > - /* If the ranges don't overlap, we can just copy the item directly */ > - if (new_start_in_dw + item->size_in_dw <= item->start_in_dw) { > - dst = (struct pipe_resource *)pool->bo; > + /* If the ranges don't overlap, or we are copying from one resource > + * to another, we can just copy the item directly */ > + if (src != dst || new_start_in_dw + item->size_in_dw <= > item->start_in_dw) { > + dst_ = dst; > > rctx->b.b.resource_copy_region(pipe, > - dst, 0, new_start_in_dw * 4, 0, 0, > - src, 0, &box); > + dst_, 0, new_start_in_dw * 4, 0, 0, > + src_, 0, &box); > } else { > /* The ranges overlap, we will try first to use an intermediate >* resource to move the item */ > - dst = (struct pipe_resource *)r600_compute_buffer_alloc_vram( > + dst_ = (struct pipe_resource *)r600_compute_buffer_alloc_vram( > pool->screen, item->size_in_dw * 4); > > - if (dst != NULL) { > + if (dst_ != NULL) { > rctx->b.b.resource_copy_region(pipe, > - dst, 0, 0, 0, 0, > - src, 0, &box); > + dst_, 0, 0, 0, 0, > + src_, 0, &box); > > - src = dst; > - dst = (struct pipe_resource *)pool->bo; > + src_ = dst_; > + dst_ = dst; > > box.x = 0; > > rctx->b.b.resource_copy_region(pipe, > - dst, 0, new_start_in_dw * 4, 0, 0, > - src, 0, &box); > + dst_, 0, new_start_in_dw * 4, 0, 0, > + src_, 0, &box); > > - pool->screen->b.b.resource_destroy(screen, src); > + pool->screen->b.b.resource_destroy(screen, src_); > > } else { > /* The allocation of the temporary resource failed, > @@ -505,7 +510,7 @
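The decision the patch makes (copy directly when source and destination are different resources or the ranges do not overlap, otherwise bounce through a temporary) can be illustrated with plain memory buffers. This is a sketch under assumptions, not the driver code: `move_item` is a hypothetical helper, `memcpy` and `malloc` stand in for `resource_copy_region` and the VRAM allocation, `size` is assumed to be non-zero, and the further fallback that the quoted diff cuts off is not reproduced.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Moves size bytes from src + old_start to dst + new_start.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
static int move_item(uint8_t *dst, const uint8_t *src,
                     size_t old_start, size_t new_start, size_t size)
{
    /* Within one buffer, items are only moved toward the front. */
    assert(src != dst || new_start <= old_start);

    /* Different buffers, or non-overlapping ranges: one direct copy. */
    if (src != dst || new_start + size <= old_start) {
        memcpy(dst + new_start, src + old_start, size);
        return 0;
    }

    /* Overlapping ranges in the same buffer: go through a temporary. */
    uint8_t *tmp = malloc(size);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, src + old_start, size);
    memcpy(dst + new_start, tmp, size);
    free(tmp);
    return 0;
}
```

Passing the same pointer for `src` and `dst` corresponds to the existing in-pool defragmentation; passing two different buffers is the new case this patch enables.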
Re: [Mesa-dev] [PATCH 2/3] r600g/compute: Allow compute_memory_defrag to defragment between resources
On Sat, Jul 19, 2014 at 07:35:50PM +0200, Bruno Jiménez wrote: > This will be used in the following patch to avoid duplicated code > --- Reviewed-by: Tom Stellard > src/gallium/drivers/r600/compute_memory_pool.c | 11 ++- > src/gallium/drivers/r600/compute_memory_pool.h | 1 + > 2 files changed, 7 insertions(+), 5 deletions(-) > > diff --git a/src/gallium/drivers/r600/compute_memory_pool.c > b/src/gallium/drivers/r600/compute_memory_pool.c > index 1ad77ad..ca36240 100644 > --- a/src/gallium/drivers/r600/compute_memory_pool.c > +++ b/src/gallium/drivers/r600/compute_memory_pool.c > @@ -293,7 +293,8 @@ int compute_memory_finalize_pending(struct > compute_memory_pool* pool, > } > > if (pool->status & POOL_FRAGMENTED) { > - compute_memory_defrag(pool, pipe); > + struct pipe_resource *src = (struct pipe_resource *)pool->bo; > + compute_memory_defrag(pool, src, src, pipe); > } > > if (pool->size_in_dw < allocated + unallocated) { > @@ -328,20 +329,20 @@ int compute_memory_finalize_pending(struct > compute_memory_pool* pool, > * \param pool The pool to be defragmented > */ > void compute_memory_defrag(struct compute_memory_pool *pool, > + struct pipe_resource *src, struct pipe_resource *dst, > struct pipe_context *pipe) > { > struct compute_memory_item *item; > - struct pipe_resource *src = (struct pipe_resource *)pool->bo; > int64_t last_pos; > > COMPUTE_DBG(pool->screen, "* compute_memory_defrag()\n"); > > last_pos = 0; > LIST_FOR_EACH_ENTRY(item, pool->item_list, link) { > - if (item->start_in_dw != last_pos) { > - assert(last_pos < item->start_in_dw); > + if (src != dst || item->start_in_dw != last_pos) { > + assert(last_pos <= item->start_in_dw); > > - compute_memory_move_item(pool, src, src, > + compute_memory_move_item(pool, src, dst, > item, last_pos, pipe); > } > > diff --git a/src/gallium/drivers/r600/compute_memory_pool.h > b/src/gallium/drivers/r600/compute_memory_pool.h > index 822bfbe..5f1d72b 100644 > --- a/src/gallium/drivers/r600/compute_memory_pool.h > +++ 
b/src/gallium/drivers/r600/compute_memory_pool.h > @@ -91,6 +91,7 @@ int compute_memory_finalize_pending(struct > compute_memory_pool* pool, > struct pipe_context * pipe); > > void compute_memory_defrag(struct compute_memory_pool *pool, > + struct pipe_resource *src, struct pipe_resource *dst, > struct pipe_context *pipe); > > int compute_memory_promote_item(struct compute_memory_pool *pool, > -- > 2.0.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] r600g/compute: Defrag the pool at the same time as we grow it
On Sat, Jul 19, 2014 at 07:35:51PM +0200, Bruno Jiménez wrote: > This allows us two things: we now need less item copies when we have > to defrag+grow the pool (to just one copy per item) and, even in the > case where we don't need to defrag the pool, we reduce the data copied > to just the useful data that the items use. > > Note: The fallback path is a bit ugly now, but hopefully we won't need > it much. Reviewed-by: Tom Stellard > --- > src/gallium/drivers/r600/compute_memory_pool.c | 40 > -- > src/gallium/drivers/r600/compute_memory_pool.h | 2 +- > 2 files changed, 19 insertions(+), 23 deletions(-) > > diff --git a/src/gallium/drivers/r600/compute_memory_pool.c > b/src/gallium/drivers/r600/compute_memory_pool.c > index ca36240..32f5892 100644 > --- a/src/gallium/drivers/r600/compute_memory_pool.c > +++ b/src/gallium/drivers/r600/compute_memory_pool.c > @@ -169,10 +169,12 @@ struct list_head *compute_memory_postalloc_chunk( > * Reallocates pool, conserves data. > * @returns -1 if it fails, 0 otherwise > */ > -int compute_memory_grow_pool(struct compute_memory_pool* pool, > - struct pipe_context * pipe, int new_size_in_dw) > +int compute_memory_grow_defrag_pool(struct compute_memory_pool *pool, > + struct pipe_context *pipe, int new_size_in_dw) > { > - COMPUTE_DBG(pool->screen, "* compute_memory_grow_pool() " > + new_size_in_dw = align(new_size_in_dw, ITEM_ALIGNMENT); > + > + COMPUTE_DBG(pool->screen, "* compute_memory_grow_defrag_pool() " > "new_size_in_dw = %d (%d bytes)\n", > new_size_in_dw, new_size_in_dw * 4); > > @@ -183,27 +185,17 @@ int compute_memory_grow_pool(struct > compute_memory_pool* pool, > } else { > struct r600_resource *temp = NULL; > > - new_size_in_dw = align(new_size_in_dw, ITEM_ALIGNMENT); > - > - COMPUTE_DBG(pool->screen, " Aligned size = %d (%d bytes)\n", > - new_size_in_dw, new_size_in_dw * 4); > - > temp = (struct r600_resource *)r600_compute_buffer_alloc_vram( > pool->screen, > new_size_in_dw * 4); > > if (temp != NULL) { > - struct 
r600_context *rctx = (struct r600_context *)pipe; > struct pipe_resource *src = (struct pipe_resource > *)pool->bo; > struct pipe_resource *dst = (struct pipe_resource > *)temp; > - struct pipe_box box; > > - COMPUTE_DBG(pool->screen, " Growing the pool using a > temporary resource\n"); > + COMPUTE_DBG(pool->screen, " Growing and defragmenting > the pool " > + "using a temporary resource\n"); > > - u_box_1d(0, pool->size_in_dw * 4, &box); > - > - rctx->b.b.resource_copy_region(pipe, > - dst, 0, 0, 0 ,0, > - src, 0, &box); > + compute_memory_defrag(pool, src, dst, pipe); > > pool->screen->b.b.resource_destroy( > (struct pipe_screen *)pool->screen, > @@ -229,6 +221,11 @@ int compute_memory_grow_pool(struct compute_memory_pool* > pool, > pool->screen, > pool->size_in_dw * 4); > compute_memory_shadow(pool, pipe, 0); > + > + if (pool->status & POOL_FRAGMENTED) { > + struct pipe_resource *src = (struct > pipe_resource *)pool->bo; > + compute_memory_defrag(pool, src, src, pipe); > + } > } > } > > @@ -292,16 +289,15 @@ int compute_memory_finalize_pending(struct > compute_memory_pool* pool, > return 0; > } > > - if (pool->status & POOL_FRAGMENTED) { > - struct pipe_resource *src = (struct pipe_resource *)pool->bo; > - compute_memory_defrag(pool, src, src, pipe); > - } > - > if (pool->size_in_dw < allocated + unallocated) { > - err = compute_memory_grow_pool(pool, pipe, allocated + > unallocated); > + err = compute_memory_grow_defrag_pool(pool, pipe, allocated + > unallocated); > if (err == -1) > return -1; > } > + else if (pool->status & POOL_FRAGMENTED) { > + struct pipe_resource *src = (struct pipe_resource *)pool->bo; > + compute_memory_defrag(pool, src, src, pipe); > + } > > /* After defragmenting the pool, allocated is equal to the first > available >* position for new items in the pool */ > diff --git a/src/gallium/drivers/r600/compute_memory_pool.h > b/src/gallium/drivers/r600/compute_memory_pool.h > index 5f1d72b..c7eb237 100644 > --- a/src/gallium/drivers/r600/c
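The benefit claimed in the commit message (one copy per item, and only useful data copied) can be seen in a small sketch that grows and compacts in a single pass. The names `struct pool_item` and `grow_and_defrag` are hypothetical, host memory stands in for the GPU resources, and alignment is again ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified item; not the driver's compute_memory_item. */
struct pool_item {
    size_t start; /* offset of the item inside the pool, in bytes */
    size_t size;  /* size of the item, in bytes */
};

/* Allocates a pool of new_size bytes and copies every item (sorted by
 * ascending start) into it back to back, so each item is copied exactly
 * once and the gaps are never copied. On success the old pool is freed and
 * the new one returned; on failure NULL is returned and nothing changes. */
static uint8_t *grow_and_defrag(uint8_t *old_pool, size_t new_size,
                                struct pool_item *items, size_t n)
{
    uint8_t *new_pool = malloc(new_size);
    if (new_pool == NULL)
        return NULL;

    size_t last_pos = 0;
    for (size_t i = 0; i < n; i++) {
        assert(last_pos + items[i].size <= new_size);
        memcpy(new_pool + last_pos, old_pool + items[i].start,
               items[i].size);
        items[i].start = last_pos;
        last_pos += items[i].size;
    }

    free(old_pool);
    return new_pool;
}
```

Compared with growing first (copying the whole old pool) and defragmenting afterwards, this touches each item once, at the cost of holding both pools in memory while the copy runs.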
Re: [Mesa-dev] [PATCH] Add an accelerated version of F_TO_I for x86_64
On Tue, Jul 22, 2014 at 4:18 PM, Jason Ekstrand wrote:
> On Mon, Jul 21, 2014 at 5:29 PM, Matt Turner wrote:
>>
>> On Mon, Jul 21, 2014 at 5:16 PM, Jason Ekstrand
>> wrote:
>> > Signed-off-by: Jason Ekstrand
>> > ---
>> > src/mesa/main/imports.h | 4
>> > 1 file changed, 4 insertions(+)
>> >
>> > diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
>> > index af780b2..5d6486b 100644
>> > --- a/src/mesa/main/imports.h
>> > +++ b/src/mesa/main/imports.h
>> > @@ -285,6 +285,10 @@ static inline int F_TO_I(float f)
>> > int r;
>> > __asm__ ("fistpl %0" : "=m" (r) : "t" (f) : "st");
>> > return r;
>> > +#elif defined(USE_X86_64_ASM) && defined(__GNUC__)
>> > + int r;
>> > + __asm__ ("cvtss2si %1, %0" : "=r" (r) : "xm" (f));
>>
>> "xm"? I think you just want "x"
>
>
> No, this is needed because it uses an SSE register.

Right... "x" is SSE register. "m" is memory. Are you meaning to specify both?

>> Also "=&r" since it's written without ever being read.
>
>
> Reading the GCC docs, I don't see how earlyclobber is appropriate. Would
> you care to explain your reasoning further?

It's probably fine without it. The rule of thumb I've always seen is
use =& if the argument is written before it could have been read.

Both of these make me think that you should just use _mm_cvtss_si32()
and let the compiler sort this kind of thing out.
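As a rough illustration of the intrinsic-based suggestion, the conversion could look something like the sketch below. The function name `f_to_i` is made up for this example and is not the Mesa macro; the `__SSE__` guard and the `lrintf` fallback are assumptions added so the snippet also builds on non-x86 targets. Both paths round according to the current rounding mode, which is round-to-nearest-even by default.

```c
#include <math.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Converts a float to int using the current rounding mode. */
static inline int f_to_i(float f)
{
#if defined(__SSE__)
    /* _mm_set_ss places f in the low lane of an SSE register and
     * _mm_cvtss_si32 converts that lane (a cvtss2si instruction), leaving
     * register and memory operand selection to the compiler. */
    return _mm_cvtss_si32(_mm_set_ss(f));
#else
    return (int)lrintf(f);
#endif
}
```

With the intrinsic there are no constraint strings to get right, which is the point made at the end of the message.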
[Mesa-dev] [Bug 80848] [dri3] Building mesa fails with dri3 enabled
https://bugs.freedesktop.org/show_bug.cgi?id=80848 --- Comment #9 from Bryan Cain --- That patch makes no difference for me.
Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_query_lod
On Wed, Jul 23, 2014 at 4:48 AM, Glenn Kennard wrote: > Requires Evergreen or later Reviewed-by: Alex Deucher > --- > Passes ARB_texture_query_lod piglits, no other regressions, > tested on radeon 6670. > > docs/GL3.txt | 2 +- > src/gallium/drivers/r600/r600_pipe.c | 2 +- > src/gallium/drivers/r600/r600_shader.c | 13 - > 3 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/docs/GL3.txt b/docs/GL3.txt > index 8128692..d481148 100644 > --- a/docs/GL3.txt > +++ b/docs/GL3.txt > @@ -119,7 +119,7 @@ GL 4.0: >GL_ARB_texture_buffer_object_rgb32 DONE (i965, nvc0, > r600, radeonsi, softpipe) >GL_ARB_texture_cube_map_arrayDONE (i965, nv50, > nvc0, r600, radeonsi, softpipe) >GL_ARB_texture_gatherDONE (i965, nv50, > nvc0, radeonsi, r600) > - GL_ARB_texture_query_lod DONE (i965, nv50, > nvc0, radeonsi) > + GL_ARB_texture_query_lod DONE (i965, nv50, > nvc0, r600, radeonsi) >GL_ARB_transform_feedback2 DONE (i965, nv50, > nvc0, r600, radeonsi) >GL_ARB_transform_feedback3 DONE (i965, nv50, > nvc0, r600, radeonsi) > > diff --git a/src/gallium/drivers/r600/r600_pipe.c > b/src/gallium/drivers/r600/r600_pipe.c > index 5bf9c00..7c50169 100644 > --- a/src/gallium/drivers/r600/r600_pipe.c > +++ b/src/gallium/drivers/r600/r600_pipe.c > @@ -304,6 +304,7 @@ static int r600_get_param(struct pipe_screen* pscreen, > enum pipe_cap param) > case PIPE_CAP_CUBE_MAP_ARRAY: > case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT: > case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS: > + case PIPE_CAP_TEXTURE_QUERY_LOD: > return family >= CHIP_CEDAR ? 1 : 0; > > /* Unsupported features. 
*/ > @@ -314,7 +315,6 @@ static int r600_get_param(struct pipe_screen* pscreen, > enum pipe_cap param) > case PIPE_CAP_VERTEX_COLOR_CLAMPED: > case PIPE_CAP_USER_VERTEX_BUFFERS: > case PIPE_CAP_TEXTURE_GATHER_SM5: > - case PIPE_CAP_TEXTURE_QUERY_LOD: > case PIPE_CAP_SAMPLE_SHADING: > case PIPE_CAP_TEXTURE_GATHER_OFFSETS: > case PIPE_CAP_DRAW_INDIRECT: > diff --git a/src/gallium/drivers/r600/r600_shader.c > b/src/gallium/drivers/r600/r600_shader.c > index db928f3..499e511 100644 > --- a/src/gallium/drivers/r600/r600_shader.c > +++ b/src/gallium/drivers/r600/r600_shader.c > @@ -5106,13 +5106,21 @@ static int tgsi_tex(struct r600_shader_ctx *ctx) > tex.dst_sel_x = (inst->Dst[0].Register.WriteMask & 2) ? 1 : 7; > tex.dst_sel_y = (inst->Dst[0].Register.WriteMask & 4) ? 2 : 7; > tex.dst_sel_z = (inst->Dst[0].Register.WriteMask & 1) ? 0 : 7; > + tex.dst_sel_w = (inst->Dst[0].Register.WriteMask & 8) ? 3 : 7; > + } > + else if (inst->Instruction.Opcode == TGSI_OPCODE_LODQ) { > + tex.dst_sel_x = (inst->Dst[0].Register.WriteMask & 2) ? 1 : 7; > + tex.dst_sel_y = (inst->Dst[0].Register.WriteMask & 1) ? 0 : 7; > + tex.dst_sel_z = 7; > + tex.dst_sel_w = 7; > } > else { > tex.dst_sel_x = (inst->Dst[0].Register.WriteMask & 1) ? 0 : 7; > tex.dst_sel_y = (inst->Dst[0].Register.WriteMask & 2) ? 1 : 7; > tex.dst_sel_z = (inst->Dst[0].Register.WriteMask & 4) ? 2 : 7; > + tex.dst_sel_w = (inst->Dst[0].Register.WriteMask & 8) ? 3 : 7; > } > - tex.dst_sel_w = (inst->Dst[0].Register.WriteMask & 8) ? 
3 : 7; > + > > if (inst->Instruction.Opcode == TGSI_OPCODE_TXQ_LZ) { > tex.src_sel_x = 4; > @@ -6669,6 +6677,7 @@ static struct r600_shader_tgsi_instruction > r600_shader_tgsi_instruction[] = { > {TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > {TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > {TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_unsupported}, > + {TGSI_OPCODE_LODQ, 0, FETCH_OP_GET_LOD, tgsi_unsupported}, > {TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported}, > }; > > @@ -6864,6 +6873,7 @@ static struct r600_shader_tgsi_instruction > eg_shader_tgsi_instruction[] = { > {TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > {TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > {TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_tex}, > + {TGSI_OPCODE_LODQ, 0, FETCH_OP_GET_LOD, tgsi_tex}, > {TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported}, > }; > > @@ -7060,5 +7070,6 @@ static struct r600_shader_tgsi_instruction > cm_shader_tgsi_instruction[] = { > {TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > {TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > {TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_tex}, > + {TGSI_OPCODE
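The LODQ branch in the tgsi_tex hunk quoted above picks destination selects differently from the default case: TGSI write-mask bit 1 (y) selects hardware channel 1 into x, bit 0 (x) selects channel 0 into y, and z/w are always masked. A minimal standalone sketch of just that selection logic, assuming that select value 7 means "do not write this channel" (the struct and function names here are made up for illustration and are not in r600_shader.c):

```c
#include <stdint.h>

/* Assumed meaning of select value 7 in the quoted hunk: channel masked out. */
#define SEL_MASKED 7

/* Hypothetical container for the four dst_sel_* fields set in the patch. */
struct dst_sel {
    uint8_t x, y, z, w;
};

/* Mirrors the TGSI_OPCODE_LODQ branch of the quoted hunk. */
static struct dst_sel lodq_dst_sel(unsigned write_mask)
{
    struct dst_sel s;
    s.x = (write_mask & 2) ? 1 : SEL_MASKED;
    s.y = (write_mask & 1) ? 0 : SEL_MASKED;
    s.z = SEL_MASKED; /* LODQ only produces two components */
    s.w = SEL_MASKED;
    return s;
}

/* Mirrors the final else branch of the quoted hunk (identity mapping). */
static struct dst_sel default_dst_sel(unsigned write_mask)
{
    struct dst_sel s;
    s.x = (write_mask & 1) ? 0 : SEL_MASKED;
    s.y = (write_mask & 2) ? 1 : SEL_MASKED;
    s.z = (write_mask & 4) ? 2 : SEL_MASKED;
    s.w = (write_mask & 8) ? 3 : SEL_MASKED;
    return s;
}
```

With a full xy write mask the LODQ variant yields selects (1, 0, 7, 7), which suggests the two results come back in the opposite order from what TGSI expects; that reading is an inference from the hunk, not something the patch states.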
Re: [Mesa-dev] [PATCH] r600g: Add IMUL_HI/UMUL_HI support
On Wed, Jul 23, 2014 at 5:10 AM, Glenn Kennard wrote: > Fixes fs-imulExtended, fs-imulExtended-only-msb, fs-umulExtended, > fs-umulExtended-only-msb piglit tests. > --- > Tested on radeon 6670 Reviewed-by: Alex Deucher > > src/gallium/drivers/r600/r600_shader.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/src/gallium/drivers/r600/r600_shader.c > b/src/gallium/drivers/r600/r600_shader.c > index db928f3..6ba9c0f 100644 > --- a/src/gallium/drivers/r600/r600_shader.c > +++ b/src/gallium/drivers/r600/r600_shader.c > @@ -,8 +,8 @@ static struct r600_shader_tgsi_instruction > r600_shader_tgsi_instruction[] = { > {TGSI_OPCODE_TEX2, 0, FETCH_OP_SAMPLE, tgsi_tex}, > {TGSI_OPCODE_TXB2, 0, FETCH_OP_SAMPLE_LB, tgsi_tex}, > {TGSI_OPCODE_TXL2, 0, FETCH_OP_SAMPLE_L, tgsi_tex}, > - {TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > - {TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > + {TGSI_OPCODE_IMUL_HI, 0, ALU_OP2_MULHI_INT, tgsi_op2_trans}, > + {TGSI_OPCODE_UMUL_HI, 0, ALU_OP2_MULHI_UINT, tgsi_op2_trans}, > {TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_unsupported}, > {TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported}, > }; > @@ -6861,8 +6861,8 @@ static struct r600_shader_tgsi_instruction > eg_shader_tgsi_instruction[] = { > {TGSI_OPCODE_TEX2, 0, FETCH_OP_SAMPLE, tgsi_tex}, > {TGSI_OPCODE_TXB2, 0, FETCH_OP_SAMPLE_LB, tgsi_tex}, > {TGSI_OPCODE_TXL2, 0, FETCH_OP_SAMPLE_L, tgsi_tex}, > - {TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > - {TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > + {TGSI_OPCODE_IMUL_HI, 0, ALU_OP2_MULHI_INT, tgsi_op2_trans}, > + {TGSI_OPCODE_UMUL_HI, 0, ALU_OP2_MULHI_UINT, tgsi_op2_trans}, > {TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_tex}, > {TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported}, > }; > @@ -7057,8 +7057,8 @@ static struct r600_shader_tgsi_instruction > cm_shader_tgsi_instruction[] = { > {TGSI_OPCODE_TEX2, 0, FETCH_OP_SAMPLE, tgsi_tex}, > {TGSI_OPCODE_TXB2, 0, 
FETCH_OP_SAMPLE_LB, tgsi_tex}, > {TGSI_OPCODE_TXL2, 0, FETCH_OP_SAMPLE_L, tgsi_tex}, > - {TGSI_OPCODE_IMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > - {TGSI_OPCODE_UMUL_HI, 0, ALU_OP0_NOP, tgsi_unsupported}, > + {TGSI_OPCODE_IMUL_HI, 0, ALU_OP2_MULHI_INT, cayman_mul_int_instr}, > + {TGSI_OPCODE_UMUL_HI, 0, ALU_OP2_MULHI_UINT, cayman_mul_int_instr}, > {TGSI_OPCODE_TG4, 0, FETCH_OP_GATHER4, tgsi_tex}, > {TGSI_OPCODE_LAST, 0, ALU_OP0_NOP, tgsi_unsupported}, > }; > -- > 1.8.3.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
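For readers unfamiliar with the two opcodes wired up in the patch above: IMUL_HI and UMUL_HI return the upper 32 bits of the full 64-bit product of two 32-bit integers, which is what the imulExtended/umulExtended msb piglit tests check. A small CPU-side reference sketch (the function names are illustrative only, and the signed version assumes the compiler implements >> on negative values as an arithmetic shift):

```c
#include <stdint.h>

/* High 32 bits of the signed 64-bit product a * b.
 * Assumes arithmetic right shift for negative int64_t values. */
static int32_t imul_hi(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* High 32 bits of the unsigned 64-bit product a * b. */
static uint32_t umul_hi(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}
```

A reference like this can be handy for computing expected values when checking shader output by hand.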
Re: [Mesa-dev] [PATCH] r600g: Implement gpu_shader5 integer ops
On Wed, Jul 23, 2014 at 5:36 AM, Glenn Kennard wrote: > --- > Together with separate MUL_HI/UMUL_HI patch this passes piglit > ARB_gpu_shader5 integer tests. > > This patch trivially depends on r600g-Implement-GL_ARB_texture_query_lod > for the TGSI_OPCODE_LODQ table entries. Reviewed-by: Alex Deucher > > docs/GL3.txt | 2 +- > src/gallium/drivers/r600/r600_shader.c | 190 > + > 2 files changed, 191 insertions(+), 1 deletion(-) > > diff --git a/docs/GL3.txt b/docs/GL3.txt > index d481148..603413f 100644 > --- a/docs/GL3.txt > +++ b/docs/GL3.txt > @@ -105,7 +105,7 @@ GL 4.0: >- Dynamically uniform UBO array indices started (Chris) >- Implicit signed -> unsigned conversionsDONE >- Fused multiply-add DONE (i965, nvc0) > - - Packing/bitfield/conversion functions DONE (i965, nvc0) > + - Packing/bitfield/conversion functions DONE (i965, nvc0, > r600) >- Enhanced textureGather DONE (i965, nvc0, > radeonsi) >- Geometry shader instancing DONE (i965, nvc0) >- Geometry shader multiple streams DONE (i965, nvc0) > diff --git a/src/gallium/drivers/r600/r600_shader.c > b/src/gallium/drivers/r600/r600_shader.c > index 499e511..9abfee1 100644 > --- a/src/gallium/drivers/r600/r600_shader.c > +++ b/src/gallium/drivers/r600/r600_shader.c > @@ -4192,6 +4192,172 @@ static int tgsi_ssg(struct r600_shader_ctx *ctx) > return 0; > } > > +static int tgsi_bfi(struct r600_shader_ctx *ctx) > +{ > + struct tgsi_full_instruction *inst = > &ctx->parse.FullToken.FullInstruction; > + struct r600_bytecode_alu alu; > + int i, r, t1, t2; > + > + unsigned write_mask = inst->Dst[0].Register.WriteMask; > + int last_inst = tgsi_last_instruction(write_mask); > + > + t1 = ctx->temp_reg; > + > + for (i = 0; i < 4; i++) { > + if (!(write_mask & (1< + continue; > + > + /* create mask tmp */ > + memset(&alu, 0, sizeof(struct r600_bytecode_alu)); > + alu.op = ALU_OP2_BFM_INT; > + alu.dst.sel = t1; > + alu.dst.chan = i; > + alu.dst.write = 1; > + alu.last = i == last_inst; > + > + r600_bytecode_src(&alu.src[0], 
&ctx->src[3], i); > + r600_bytecode_src(&alu.src[1], &ctx->src[2], i); > + > + r = r600_bytecode_add_alu(ctx->bc, &alu); > + if (r) > + return r; > + } > + > + t2 = r600_get_temp(ctx); > + > + for (i = 0; i < 4; i++) { > + if (!(write_mask & (1< + continue; > + > + /* shift insert left */ > + memset(&alu, 0, sizeof(struct r600_bytecode_alu)); > + alu.op = ALU_OP2_LSHL_INT; > + alu.dst.sel = t2; > + alu.dst.chan = i; > + alu.dst.write = 1; > + alu.last = i == last_inst; > + > + r600_bytecode_src(&alu.src[0], &ctx->src[1], i); > + r600_bytecode_src(&alu.src[1], &ctx->src[2], i); > + > + r = r600_bytecode_add_alu(ctx->bc, &alu); > + if (r) > + return r; > + } > + > + for (i = 0; i < 4; i++) { > + if (!(write_mask & (1< + continue; > + > + /* actual bitfield insert */ > + memset(&alu, 0, sizeof(struct r600_bytecode_alu)); > + alu.op = ALU_OP3_BFI_INT; > + alu.is_op3 = 1; > + tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst); > + alu.dst.chan = i; > + alu.dst.write = 1; > + alu.last = i == last_inst; > + > + alu.src[0].sel = t1; > + alu.src[0].chan = i; > + alu.src[1].sel = t2; > + alu.src[1].chan = i; > + r600_bytecode_src(&alu.src[2], &ctx->src[0], i); > + > + r = r600_bytecode_add_alu(ctx->bc, &alu); > + if (r) > + return r; > + } > + > + return 0; > +} > + > +static int tgsi_msb(struct r600_shader_ctx *ctx) > +{ > + struct tgsi_full_instruction *inst = > &ctx->parse.FullToken.FullInstruction; > + struct r600_bytecode_alu alu; > + int i, r, t1, t2; > + > + unsigned write_mask = inst->Dst[0].Register.WriteMask; > + int last_inst = tgsi_last_instruction(write_mask); > + > + assert(ctx->inst_info->op == ALU_OP1_FFBH_INT || > + ctx->inst_info->op == ALU_OP1_FFBH_UINT); > + > + t1 = ctx->temp_reg; > + > + /* bit position is indexed from lsb by TGSI, and from msb by the > hardware */ > + for (i = 0; i < 4; i++) { > + if (!(write_mask & (1< + cont
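As a reading aid for tgsi_bfi in the patch above: per channel it emits three ALU steps, a mask built from width and offset (BFM_INT into t1), the insert value shifted left by the offset (LSHL_INT into t2), and a select between the shifted insert and the base under that mask (BFI_INT). A scalar C sketch of the same three steps follows; the function names are made up for illustration, and the width >= 32 guard and offset assertion are there only to keep the C well defined, not as a claim about what the hardware does in those cases:

```c
#include <assert.h>
#include <stdint.h>

/* Step 1 (BFM_INT in the patch): `width` one-bits starting at bit `offset`. */
static uint32_t bitfield_mask(uint32_t width, uint32_t offset)
{
    assert(offset < 32); /* shifting a 32-bit value by >= 32 is undefined in C */
    uint32_t ones = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << offset;
}

/* Steps 2 and 3: shift the insert value into place, then merge under the mask. */
static uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                                uint32_t offset, uint32_t width)
{
    uint32_t mask = bitfield_mask(width, offset);  /* t1 in the patch */
    uint32_t shifted = insert << offset;           /* t2 in the patch */
    return (mask & shifted) | (~mask & base);      /* BFI_INT-style select */
}
```

For example, inserting 0xAB at offset 4 with width 8 into a zero base gives 0xAB0.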
Re: [Mesa-dev] [PATCH] r600g: gpu_shader5 gl_SampleMaskIn support
On Wed, Jul 23, 2014 at 5:57 AM, Glenn Kennard wrote: > Map TGSI_SEMANTIC_SAMPLEMASK to register/component. > Enable face register when sample mask is needed by shader. > Requires Evergreen/Cayman > --- > I think the rest of the sample related bits in gpu_shader5 are > from GL_ARB_sample_shading which isn't implemented yet in r600. > > Passes samplemaskin-basic piglit, no regressions, on radeon 6670 Reviewed-by: Alex Deucher > > docs/GL3.txt | 2 +- > src/gallium/drivers/r600/evergreen_state.c | 10 ++-- > src/gallium/drivers/r600/r600_shader.c | 37 > ++ > 3 files changed, 41 insertions(+), 8 deletions(-) > > diff --git a/docs/GL3.txt b/docs/GL3.txt > index 8128692..53e19e0 100644 > --- a/docs/GL3.txt > +++ b/docs/GL3.txt > @@ -109,7 +109,7 @@ GL 4.0: >- Enhanced textureGather DONE (i965, nvc0, > radeonsi) >- Geometry shader instancing DONE (i965, nvc0) >- Geometry shader multiple streams DONE (i965, nvc0) > - - Enhanced per-sample shadingDONE (i965) > + - Enhanced per-sample shadingDONE (i965, r600) >- Interpolation functionsDONE (i965) >- New overload resolution rules DONE >GL_ARB_gpu_shader_fp64 started (Dave) > diff --git a/src/gallium/drivers/r600/evergreen_state.c > b/src/gallium/drivers/r600/evergreen_state.c > index 8f5ba5f..839d2ae 100644 > --- a/src/gallium/drivers/r600/evergreen_state.c > +++ b/src/gallium/drivers/r600/evergreen_state.c > @@ -2843,8 +2843,14 @@ void evergreen_update_ps_state(struct pipe_context > *ctx, struct r600_pipe_shader >POSITION goes via GPRs from the SC so isn't counted */ > if (rshader->input[i].name == TGSI_SEMANTIC_POSITION) > pos_index = i; > - else if (rshader->input[i].name == TGSI_SEMANTIC_FACE) > - face_index = i; > + else if (rshader->input[i].name == TGSI_SEMANTIC_FACE) { > + if (face_index == -1) > + face_index = i; > + } > + else if (rshader->input[i].name == TGSI_SEMANTIC_SAMPLEMASK) { > + if (face_index == -1) > + face_index = i; /* lives in same register, > same enable bit */ > + } > else { > ninterp++; > if 
(rshader->input[i].interpolate == > TGSI_INTERPOLATE_LINEAR) > diff --git a/src/gallium/drivers/r600/r600_shader.c > b/src/gallium/drivers/r600/r600_shader.c > index db928f3..c8ab4dd 100644 > --- a/src/gallium/drivers/r600/r600_shader.c > +++ b/src/gallium/drivers/r600/r600_shader.c > @@ -287,7 +287,9 @@ struct r600_shader_ctx { > boolean input_linear; > boolean input_perspective; > int num_interp_gpr; > + /* evergreen/cayman also store sample mask in face register */ > int face_gpr; > + boolean has_samplemask; > int colors_used; > boolean clip_vertex_write; > unsignedcv_output; > @@ -498,7 +500,8 @@ static int r600_spi_sid(struct r600_shader_io * io) > if (name == TGSI_SEMANTIC_POSITION || > name == TGSI_SEMANTIC_PSIZE || > name == TGSI_SEMANTIC_EDGEFLAG || > - name == TGSI_SEMANTIC_FACE) > + name == TGSI_SEMANTIC_FACE || > + name == TGSI_SEMANTIC_SAMPLEMASK) > index = 0; > else { > if (name == TGSI_SEMANTIC_GENERIC) { > @@ -585,7 +588,8 @@ static int tgsi_declaration(struct r600_shader_ctx *ctx) > ctx->shader->input[i].spi_sid = > r600_spi_sid(&ctx->shader->input[i]); > switch (ctx->shader->input[i].name) { > case TGSI_SEMANTIC_FACE: > - ctx->face_gpr = ctx->shader->input[i].gpr; > + if (ctx->face_gpr == -1) > + ctx->face_gpr = > ctx->shader->input[i].gpr; > break; > case TGSI_SEMANTIC_COLOR: > ctx->colors_used++; > @@ -675,7 +679,14 @@ static int tgsi_declaration(struct r600_shader_ctx *ctx) > break; > > case TGSI_FILE_SYSTEM_VALUE: > - if (d->Semantic.Name == TGSI_SEMANTIC_INSTANCEID) { > + if (d->Semantic.Name == TGSI_SEMANTIC_SAMPLEMASK) { > + ctx->has_samplemask = true; > + /* lives in F
Re: [Mesa-dev] [PATCH] r600g: Implement BPTC texture support
On Wed, Jul 23, 2014 at 7:16 AM, Glenn Kennard wrote: > Signed-off-by: Glenn Kennard > --- > This patch depends on Ilia Mirkin's "nvc0: add BPTC format support" > and Neil Robert's core BPTC support patches. Reviewed-by: Alex Deucher > > src/gallium/drivers/r600/r600_state_common.c | 23 +++ > 1 file changed, 23 insertions(+) > > diff --git a/src/gallium/drivers/r600/r600_state_common.c > b/src/gallium/drivers/r600/r600_state_common.c > index 8c37d0d..2f39df3 100644 > --- a/src/gallium/drivers/r600/r600_state_common.c > +++ b/src/gallium/drivers/r600/r600_state_common.c > @@ -1967,6 +1967,29 @@ uint32_t r600_translate_texformat(struct pipe_screen > *screen, > } > } > > + if (desc->layout == UTIL_FORMAT_LAYOUT_BPTC) { > + if (!enable_s3tc) > + goto out_unknown; > + > + if (rscreen->b.chip_class < EVERGREEN) > + goto out_unknown; > + > + switch (format) { > + case PIPE_FORMAT_BPTC_RGBA_UNORM: > + case PIPE_FORMAT_BPTC_SRGBA_UNORM: > + result = FMT_BC7; > + is_srgb_valid = TRUE; > + goto out_word4; > + case PIPE_FORMAT_BPTC_RGB_FLOAT: > + case PIPE_FORMAT_BPTC_RGB_UFLOAT: > + result = FMT_BC6; > + is_srgb_valid = TRUE; > + goto out_word4; > + default: > + goto out_unknown; > + } > + } > + > if (desc->layout == UTIL_FORMAT_LAYOUT_SUBSAMPLED) { > switch (format) { > case PIPE_FORMAT_R8G8_B8G8_UNORM: > -- > 1.8.3.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: Implement BPTC texture support
On Wed, Jul 23, 2014 at 7:16 AM, Glenn Kennard wrote:
> Signed-off-by: Glenn Kennard
> ---
> This patch depends on Ilia Mirkin's "nvc0: add BPTC format support"
> and Neil Robert's core BPTC support patches.
>
>  src/gallium/drivers/r600/r600_state_common.c | 23 +++
>  1 file changed, 23 insertions(+)
>
> diff --git a/src/gallium/drivers/r600/r600_state_common.c
> b/src/gallium/drivers/r600/r600_state_common.c
> index 8c37d0d..2f39df3 100644
> --- a/src/gallium/drivers/r600/r600_state_common.c
> +++ b/src/gallium/drivers/r600/r600_state_common.c
> @@ -1967,6 +1967,29 @@ uint32_t r600_translate_texformat(struct pipe_screen
> *screen,
> 		}
> 	}
>
> +	if (desc->layout == UTIL_FORMAT_LAYOUT_BPTC) {
> +		if (!enable_s3tc)
> +			goto out_unknown;
> +
> +		if (rscreen->b.chip_class < EVERGREEN)
> +			goto out_unknown;
> +
> +		switch (format) {
> +		case PIPE_FORMAT_BPTC_RGBA_UNORM:
> +		case PIPE_FORMAT_BPTC_SRGBA_UNORM:
> +			result = FMT_BC7;
> +			is_srgb_valid = TRUE;
> +			goto out_word4;
> +		case PIPE_FORMAT_BPTC_RGB_FLOAT:
> +		case PIPE_FORMAT_BPTC_RGB_UFLOAT:
> +			result = FMT_BC6;
> +			is_srgb_valid = TRUE;

[Usual disclaimer: I don't know much about radeon.]

For nvc0, there's a separate texture format for FLOAT and UFLOAT. How does the hw tell which one to decode it as? [And I presume it figures out RGBA vs SRGBA by looking at the format description's colorspace field...]

> +			goto out_word4;
> +		default:
> +			goto out_unknown;
> +		}
> +	}
> +
> 	if (desc->layout == UTIL_FORMAT_LAYOUT_SUBSAMPLED) {
> 		switch (format) {
> 		case PIPE_FORMAT_R8G8_B8G8_UNORM:
> --
> 1.8.3.2
Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_query_lod
On Wed, Jul 23, 2014 at 12:45 PM, Matt Turner wrote:
> On Wed, Jul 23, 2014 at 1:48 AM, Glenn Kennard wrote:
>> Requires Evergreen or later
>> ---
>> Passes ARB_texture_query_lod piglits, no other regressions,
>> tested on radeon 6670.
>
> Oh, good to hear. Marek had patches to loosen the tolerance on the
> piglit tests for radeonsi. I presume we can conclude that if they pass
> on r600g that the tests are not too strict.

Or

http://cgit.freedesktop.org/piglit/commit/?id=7e699cdb47f328206afa6dd454de8d6f28d7ffe9

Could be interesting to see if it passes with that commit reverted.

-ilia
Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_query_lod
On Wed, Jul 23, 2014 at 1:48 AM, Glenn Kennard wrote:
> Requires Evergreen or later
> ---
> Passes ARB_texture_query_lod piglits, no other regressions,
> tested on radeon 6670.

Oh, good to hear. Marek had patches to loosen the tolerance on the piglit tests for radeonsi. I presume we can conclude that if they pass on r600g that the tests are not too strict.
Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_query_lod
On Wed, Jul 23, 2014 at 9:49 AM, Ilia Mirkin wrote:
> On Wed, Jul 23, 2014 at 12:45 PM, Matt Turner wrote:
>> On Wed, Jul 23, 2014 at 1:48 AM, Glenn Kennard wrote:
>>> Requires Evergreen or later
>>> ---
>>> Passes ARB_texture_query_lod piglits, no other regressions,
>>> tested on radeon 6670.
>>
>> Oh, good to hear. Marek had patches to loosen the tolerance on the
>> piglit tests for radeonsi. I presume we can conclude that if they pass
>> on r600g that the tests are not too strict.
>
> Or
>
> http://cgit.freedesktop.org/piglit/commit/?id=7e699cdb47f328206afa6dd454de8d6f28d7ffe9
>
> Could be interesting to see if it passes with that commit reverted.

Oh, I thought I sufficiently NAK'd that.
Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_query_lod
I don't see your NAK and Ian gave me an ACK. Besides, what would the tests be good for if you couldn't do any regression testing because they would always fail due to precision issues?

Marek

On Wed, Jul 23, 2014 at 6:54 PM, Matt Turner wrote:
> On Wed, Jul 23, 2014 at 9:49 AM, Ilia Mirkin wrote:
>> On Wed, Jul 23, 2014 at 12:45 PM, Matt Turner wrote:
>>> On Wed, Jul 23, 2014 at 1:48 AM, Glenn Kennard wrote:
>>>> Requires Evergreen or later
>>>> ---
>>>> Passes ARB_texture_query_lod piglits, no other regressions,
>>>> tested on radeon 6670.
>>>
>>> Oh, good to hear. Marek had patches to loosen the tolerance on the
>>> piglit tests for radeonsi. I presume we can conclude that if they pass
>>> on r600g that the tests are not too strict.
>>
>> Or
>>
>> http://cgit.freedesktop.org/piglit/commit/?id=7e699cdb47f328206afa6dd454de8d6f28d7ffe9
>>
>> Could be interesting to see if it passes with that commit reverted.
>
> Oh, I thought I sufficiently NAK'd that.
Re: [Mesa-dev] [PATCH 00/12] Rework texture upload code
On Fri, Jul 18, 2014 at 5:32 PM, Brian Paul wrote:
> On 07/17/2014 12:04 PM, Jason Ekstrand wrote:
>>
>> This is the first installment of some work I've been doing over the past
>> couple of weeks to refactor mesa's texture conversion/storage code. There
>> is more to be done and more that I have done but have not included in this
>> series. This is the first mailing-list-ready fruits of my efforts. The
>> important bits here include:
>>
>> 1) Using a human-readable CSV file to describe texture formats similar to
>>    the way it is currently done in gallium. This is much easier to
>>    read/edit than the structure in formats.c. The guts of formats.c is
>>    then autogenerated from this CSV file.
>
>
> I'm kind of on the fence about this. Some of us have been hoping that we'd
> eventually consolidate some of the Mesa and gallium code so we wouldn't have
> duplicated/parallel code. The format code is a good example. In theory,
> the MESA_FORMAT_ stuff could be replaced by the gallium PIPE_FORMAT_ code.

I agree and I think this rework brings us closer to sharing the code, which is a good thing. In the meantime, it nicely cleans up the code.

BTW, this series will conflict with the big endian work that Richard Sandiford is doing.

Marek
Re: [Mesa-dev] [PATCH 00/12] Rework texture upload code
On Wed, Jul 23, 2014 at 10:31 AM, Marek Olšák wrote: > On Fri, Jul 18, 2014 at 5:32 PM, Brian Paul wrote: > > On 07/17/2014 12:04 PM, Jason Ekstrand wrote: > >> > >> This is the first installment of some work I've been doing over the past > >> couple of weeks to refactor mesa's texture conversion/storage code. > There > >> is more to be done and more that I have done but have not included in > this > >> series. This is the first mailing-list-ready fruits of my efforts. The > >> important bits here include: > >> > >> 1) Using a human-readable CSV file to describe texture formats similar > >> to > >> the way it is currently don in gallium. This is much easier to > >> read/edit than the structure in formats.c. The guts of formats.c > is > >> then autogenerated from this CSV file. > > > > > > I'm kind of on the fence about this. Some of us have been hoping that > we'd > > eventually consolidate some of the Mesa and gallium code so we wouldn't > have > > duplicated/parallel code. The format code is a good example. In theory, > > the MESA_FORMAT_ stuff could be replaced by the gallium PIPE_FORMAT_ > code. > > I agree and I think this rework brings us closer to sharing the code, > which is a good thing. In the meantime, it nicely cleans up the code. > In that vein, I've taken a slightly different tack that I think will make the sharing easier. Right now, I'm working on moving a bunch of code into a new static library called libformat that can be independently developed and *tested*. One of the problems we had before was that the only format-conversion testing we did was through piglit which can't actually hit all of the code-paths. Hopefully, splitting it out will make it easier to test and easier to combine efforts than gallium depending on mesa or the other way around. > > BTW, this series will conflict with the big endian work that Richard > Sandiford is doing. > Yeah, I'm aware of that. 
--Jason Ekstrand
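To make the testing argument above concrete: once a conversion routine is a plain function with no GL context behind it, it can be checked exhaustively with ordinary assertions rather than indirectly through piglit. The sketch below is only an illustration of that idea. The function name is hypothetical and is not part of Mesa or of the proposed libformat, and the bit-replication expansion is one common choice, not necessarily the one Mesa uses.

```c
#include <stdint.h>

/* Hypothetical helper: expand one RGB565 pixel to 8 bits per channel,
 * packed as R | G << 8 | B << 16 | A << 24 with alpha forced to 0xFF.
 * Channels are widened by replicating their top bits into the low bits. */
static uint32_t rgb565_to_rgba8888(uint16_t pixel)
{
    uint32_t r5 = (pixel >> 11) & 0x1Fu;
    uint32_t g6 = (pixel >> 5) & 0x3Fu;
    uint32_t b5 = pixel & 0x1Fu;

    uint32_t r8 = (r5 << 3) | (r5 >> 2);
    uint32_t g8 = (g6 << 2) | (g6 >> 4);
    uint32_t b8 = (b5 << 3) | (b5 >> 2);

    return r8 | (g8 << 8) | (b8 << 16) | (0xFFu << 24);
}
```

Because the input space is only 65536 values, a unit test for a function of this shape could simply loop over every possible pixel.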
Re: [Mesa-dev] [PATCH 12/17] i965: Rename array_spacing_lod0 to non_mip_arrays
On Tue, Jul 22, 2014 at 02:22:04PM -0700, Jordan Justen wrote:
> On Tue, Jul 22, 2014 at 11:14 AM, Pohjolainen, Topi wrote:
> > On Fri, Jul 18, 2014 at 02:16:47PM -0700, Jordan Justen wrote:
> >> Generalize the name array_spacing_lod0 to non_mip_arrays. Previously
> >> it was only used in certain cases where only a single mip-level was
> >> used.
> >>
> >> For gen6 we will use non-mipmapped array spacing, but with multiple
> >> mip levels. This is needed because gen6 hiz and stencil only support a
> >> single mip-level.
> >>
> >> PRM Volume 1, Part 1, 7.18.3.7.2 For separate stencil buffer [DevILK]
> >> to [DevSNB]:
> >> "The separate stencil buffer does not support mip mapping, thus the
> >> storage for LODs other than LOD 0 is not needed."
> >>
> >> PRM Volume 2, Part 1, 7.5.3 Hierarchical Depth Buffer
> >> "[DevSNB]: The hierarchical depth buffer does not support the LOD
> >> field, it is assumed by hardware to be zero. A separate
> >> hierarchical depth buffer is required for each LOD used, and the
> >> corresponding buffer's state delivered to hardware each time a new
> >> depth buffer state with modified LOD is delivered."
> >>
> >> Signed-off-by: Jordan Justen
> >
> > I know I complained about the name in the first place, and I'm not too happy
> > about this either. The structure is still a "mip array" as it supports
> > multiple layers for multiple levels (I think the original as array of
> > miptrees and this new layout as miptree of arrays). I don't have anything
> > better to suggest and hence I'm fine with this or even dropping this patch.
>
> I don't want to leave it as array_spacing_lod0, since that has a specific
> hardware meaning, and we are extending the use of the field.
>
> What about something like this?
>
> enum miptree_array_layout {
>    /* Each array slice contains all miplevels packed together.
>     *
>     * Gen hardware usually wants miptrees
>     * configured this way.
>     */
>    ALL_LOD_IN_EACH_SLICE,
>
>    /* Each LOD contains all slices of that LOD packed together.
>     *
>     * Multisampled surfaces use this for array_spacing_lod0.
>     *
>     * Gen6 uses this for separate stencil and hiz.
>     */
>    ALL_SLICES_AT_EACH_LOD,
> };
>
> So, code might look like:
>    if (mt->array_layout == ALL_SLICES_AT_EACH_LOD)

This looks good to me! Thanks for taking time and thinking it through.

>
> -Jordan
>
> > I had some minor comments for the rest of the series and something to be
> > clarified in patch 16 but 12-17 are:
> >
> > Reviewed-by: Topi Pohjolainen
> >
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_blorp.cpp           | 2 +-
> >>  src/mesa/drivers/dri/i965/brw_blorp.h             | 2 +-
> >>  src/mesa/drivers/dri/i965/brw_tex_layout.c        | 2 +-
> >>  src/mesa/drivers/dri/i965/gen7_blorp.cpp          | 2 +-
> >>  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 6 +++---
> >>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c     | 6 +++---
> >>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h     | 2 +-
> >>  7 files changed, 11 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> index b57721c..c5ed84a 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> >> @@ -82,7 +82,7 @@ brw_blorp_surface_info::set(struct brw_context *brw,
> >>  {
> >>     brw_blorp_mip_info::set(mt, level, layer);
> >>     this->num_samples = mt->num_samples;
> >> -   this->array_spacing_lod0 = mt->array_spacing_lod0;
> >> +   this->non_mip_arrays = mt->non_mip_arrays;
> >>     this->map_stencil_as_y_tiled = false;
> >>     this->msaa_layout = mt->msaa_layout;
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h
> >> b/src/mesa/drivers/dri/i965/brw_blorp.h
> >> index 683f09e..0b360c5 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> >> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> >> @@ -153,7 +153,7 @@ public:
> >>     /* Setting this flag indicates that the surface should be set up in
> >>      * ARYSPC_LOD0 mode. Ignored prior to Gen7.
> >>      */
> >> -   bool array_spacing_lod0;
> >> +   bool non_mip_arrays;
> >>
> >>     /**
> >>      * Format that should be used when setting up the surface state for
> >> this
> >> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> >> b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> >> index 76044b2..9e2720b 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> >> @@ -241,7 +241,7 @@ brw_miptree_layout_texture_array(struct brw_context
> >> *brw,
> >>
> >>     h0 = ALIGN(mt->physical_height0, mt->align_h);
> >>     h1 = ALIGN(minify(mt->physical_height0, 1), mt->align_h);
> >> -   if (mt->array_spacing_lod0)
> >> +   if (mt->non_mip_arrays)
> >>        mt->qpitch = h0;
> >>     else
> >>        mt->qpitch = (h0 + h1 + (brw->gen >= 7 ? 12 : 11) * mt->align_h);
> >> diff --git a/src/mesa/drive
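To illustrate how the enum proposed in this thread could replace the boolean in the brw_tex_layout.c hunk quoted above, here is a standalone sketch of the qpitch computation keyed on the layout value. The enum names come from Jordan's suggestion; everything else (the function name, the local align_up/minify helpers, passing the hardware generation as a plain int) is an assumption made so the example compiles on its own, and it only mirrors the arithmetic shown in the quoted diff.

```c
#include <assert.h>

/* Names taken from the proposal quoted in this thread. */
enum miptree_array_layout {
    ALL_LOD_IN_EACH_SLICE,  /* each array slice holds all of its miplevels */
    ALL_SLICES_AT_EACH_LOD, /* each LOD holds all slices of that LOD */
};

/* Round value up to a power-of-two alignment (stand-in for Mesa's ALIGN). */
static unsigned align_up(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Size of a dimension at the given mip level, never smaller than 1. */
static unsigned minify(unsigned size, unsigned levels)
{
    unsigned m = size >> levels;
    return m ? m : 1;
}

/* Hypothetical helper mirroring the qpitch choice in the quoted hunk. */
static unsigned array_qpitch(enum miptree_array_layout layout,
                             unsigned height0, unsigned align_h, int gen)
{
    unsigned h0 = align_up(height0, align_h);
    unsigned h1 = align_up(minify(height0, 1), align_h);

    if (layout == ALL_SLICES_AT_EACH_LOD)
        return h0; /* slices of one LOD are packed back to back */

    return h0 + h1 + (gen >= 7 ? 12u : 11u) * align_h;
}
```

For a 64-row surface with align_h of 4 on gen7 this gives a pitch of 64 rows for ALL_SLICES_AT_EACH_LOD and 144 rows for ALL_LOD_IN_EACH_SLICE. How the gen6 multi-level case should lay out later LODs is not covered by the quoted hunk, so the sketch does not attempt it.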
Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_query_lod
On Wed, Jul 23, 2014 at 10:19 AM, Marek Olšák wrote:
> I don't see your NAK and Ian gave me an ACK. Besides, what would the
> tests be good for if you couldn't do any regression testing because
> they would always fail due to precision issues?

Please don't top quote.

Ian reviewed before my comments. Since I'm the author of the tests and had serious doubts, I don't think his R-b is sufficient. The thread ended with you saying you didn't have any explanation for why the hardware didn't return the expected values -- that doesn't seem sufficient. The patch also seems really suspect because it modifies only some of the tests, and does stuff like loosening the tolerance of the nearest filtering tests which should only return integral values.

We've had cases like this in the past that turned out to be driver bugs that we wouldn't have discovered if we'd papered over the test failures. (In particular a case where you wanted to loosen the restrictions of an MSAA test, and after Paul NAK'd it you discovered that the sample positions used by r600 were wrong, IIRC)

I'm okay loosening tolerances if we can come up with reasonable explanations why, but that was definitely not the case here.
Re: [Mesa-dev] [PATCH 00/12] Rework texture upload code
FYI, We thought of moving all the format code to src/util to make it independent from everything else. (yeah that directory doesn't exist yet) Marek On Wed, Jul 23, 2014 at 8:10 PM, Jason Ekstrand wrote: > > > > On Wed, Jul 23, 2014 at 10:31 AM, Marek Olšák wrote: >> >> On Fri, Jul 18, 2014 at 5:32 PM, Brian Paul wrote: >> > On 07/17/2014 12:04 PM, Jason Ekstrand wrote: >> >> >> >> This is the first installment of some work I've been doing over the >> >> past >> >> couple of weeks to refactor mesa's texture conversion/storage code. >> >> There >> >> is more to be done and more that I have done but have not included in >> >> this >> >> series. This is the first mailing-list-ready fruits of my efforts. >> >> The >> >> important bits here include: >> >> >> >> 1) Using a human-readable CSV file to describe texture formats >> >> similar >> >> to >> >> the way it is currently don in gallium. This is much easier to >> >> read/edit than the structure in formats.c. The guts of formats.c >> >> is >> >> then autogenerated from this CSV file. >> > >> > >> > I'm kind of on the fence about this. Some of us have been hoping that >> > we'd >> > eventually consolidate some of the Mesa and gallium code so we wouldn't >> > have >> > duplicated/parallel code. The format code is a good example. In >> > theory, >> > the MESA_FORMAT_ stuff could be replaced by the gallium PIPE_FORMAT_ >> > code. >> >> I agree and I think this rework brings us closer to sharing the code, >> which is a good thing. In the meantime, it nicely cleans up the code. > > > In that vein, I've taken a slightly different tack that I think will make > the sharing easier. Right now, I'm working on moving a bunch of code into a > new static library called libformat that can be independently developed and > *tested*. One of the problems we had before was that the only > format-conversion testing we did was through piglit which can't actually hit > all of the code-paths. 
Hopefully, splitting it out will make it easier to > test and easier to combine efforts than gallium depending on mesa or the > other way around. > >> >> >> BTW, this series will conflict with the big endian work that Richard >> Sandiford is doing. > > > Yeah, I'm aware of that. > > --Jason Ekstrand > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_query_lod
Sorry, I didn't know you were the author. Feel free to revert the commit if you like.

Marek

On Wed, Jul 23, 2014 at 8:19 PM, Matt Turner wrote:
> On Wed, Jul 23, 2014 at 10:19 AM, Marek Olšák wrote:
>> I don't see your NAK and Ian gave me an ACK. Besides, what would the
>> tests be good for if you couldn't do any regression testing because
>> they would always fail due to precision issues?
>
> Please don't top quote.
>
> Ian reviewed before my comments. Since I'm the author of the tests and had
> serious doubts, I don't think his R-b is sufficient. The thread ended
> with you saying you didn't have any explanation for why the hardware
> didn't return the expected values -- that doesn't seem sufficient. The
> patch also seems really suspect because it modifies only some of the
> tests, and does stuff like loosening the tolerance of the nearest
> filtering tests which should only return integral values.
>
> We've had cases like this in the past that turned out to be driver
> bugs that we wouldn't have discovered if we'd papered over the test
> failures. (In particular a case where you wanted to loosen the
> restrictions of an MSAA test, and after Paul NAK'd it you discovered
> that the sample positions used by r600 were wrong, IIRC)
>
> I'm okay loosening tolerances if we can come up with reasonable
> explanations why, but that was definitely not the case here.
[Mesa-dev] [PATCH] added functions for binding atomic buffers to extension GL_ARB_shader_atomic_counters for radeonsi and r600 backend
From: Frost --- src/mesa/main/bufferobj.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c index 7b1bba0..00f2604 100644 --- a/src/mesa/main/bufferobj.c +++ b/src/mesa/main/bufferobj.c @@ -832,6 +832,9 @@ _mesa_init_buffer_objects( struct gl_context *ctx ) _mesa_reference_buffer_object(ctx, &ctx->UniformBuffer, ctx->Shared->NullBufferObj); + _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer, + ctx->Shared->NullBufferObj); + _mesa_reference_buffer_object(ctx, &ctx->DrawIndirectBuffer, ctx->Shared->NullBufferObj); @@ -842,6 +845,14 @@ _mesa_init_buffer_objects( struct gl_context *ctx ) ctx->UniformBufferBindings[i].Offset = -1; ctx->UniformBufferBindings[i].Size = -1; } + + for (i = 0; i < MAX_COMBINED_ATOMIC_BUFFERS; i++) { + _mesa_reference_buffer_object(ctx, +&ctx->AtomicBufferBindings[i].BufferObject, +ctx->Shared->NullBufferObj); + ctx->AtomicBufferBindings[i].Offset = -1; + ctx->AtomicBufferBindings[i].Size = -1; + } } @@ -857,6 +868,8 @@ _mesa_free_buffer_objects( struct gl_context *ctx ) _mesa_reference_buffer_object(ctx, &ctx->UniformBuffer, NULL); + _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer, NULL); + _mesa_reference_buffer_object(ctx, &ctx->DrawIndirectBuffer, NULL); for (i = 0; i < MAX_COMBINED_UNIFORM_BUFFERS; i++) { @@ -864,6 +877,12 @@ _mesa_free_buffer_objects( struct gl_context *ctx ) &ctx->UniformBufferBindings[i].BufferObject, NULL); } + + for (i = 0; i < MAX_COMBINED_ATOMIC_BUFFERS; i++) { + _mesa_reference_buffer_object(ctx, +&ctx->AtomicBufferBindings[i].BufferObject, +NULL); + } } bool @@ -1200,6 +1219,17 @@ _mesa_DeleteBuffers(GLsizei n, const GLuint *ids) _mesa_BindBuffer( GL_UNIFORM_BUFFER, 0 ); } + /* unbind Atomic Buffers binding points */ + for (j = 0; j < ctx->Const.MaxAtomicBufferBindings; j++) { + if (ctx->AtomicBufferBindings[j].BufferObject == bufObj) { + _mesa_BindBufferBase( GL_ATOMIC_COUNTER_BUFFER, j, 0 ); + } + } + + if (ctx->AtomicBuffer == bufObj) { 
+_mesa_BindBuffer(GL_ATOMIC_COUNTER_BUFFER, 0); + } + /* unbind any pixel pack/unpack pointers bound to this buffer */ if (ctx->Pack.BufferObj == bufObj) { _mesa_BindBuffer( GL_PIXEL_PACK_BUFFER_EXT, 0 ); -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/17] i965: Split gen6 renderbuffer surface state from gen5 and older
On Friday, July 18, 2014 02:16:36 PM Jordan Justen wrote: > We will program the gen6 renderbuffer surface state differently to > enable layered rendering on gen6. > > Signed-off-by: Jordan Justen > Reviewed-by: Topi Pohjolainen > --- > src/mesa/drivers/dri/i965/Makefile.sources | 1 + > src/mesa/drivers/dri/i965/brw_context.c| 4 + > src/mesa/drivers/dri/i965/brw_state.h | 3 + > src/mesa/drivers/dri/i965/gen6_surface_state.c | 152 > + > 4 files changed, 160 insertions(+) > create mode 100644 src/mesa/drivers/dri/i965/gen6_surface_state.c > > diff --git a/src/mesa/drivers/dri/i965/Makefile.sources > b/src/mesa/drivers/dri/i965/Makefile.sources > index e235679..43e3378 100644 > --- a/src/mesa/drivers/dri/i965/Makefile.sources > +++ b/src/mesa/drivers/dri/i965/Makefile.sources > @@ -130,6 +130,7 @@ i965_FILES = \ > gen6_scissor_state.c \ > gen6_sf_state.c \ > gen6_sol.c \ > + gen6_surface_state.c \ > gen6_urb.c \ > gen6_viewport_state.c \ > gen6_vs_state.c \ > diff --git a/src/mesa/drivers/dri/i965/brw_context.c > b/src/mesa/drivers/dri/i965/brw_context.c > index c47ad36..dbe68a8 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.c > +++ b/src/mesa/drivers/dri/i965/brw_context.c > @@ -645,6 +645,10 @@ brwCreateContext(gl_api api, >gen7_init_vtable_surface_functions(brw); >gen7_init_vtable_sampler_functions(brw); >brw->vtbl.emit_depth_stencil_hiz = gen7_emit_depth_stencil_hiz; > + } else if (brw->gen >= 6) { > + gen6_init_vtable_surface_functions(brw); > + gen4_init_vtable_sampler_functions(brw); > + brw->vtbl.emit_depth_stencil_hiz = brw_emit_depth_stencil_hiz; > } else { >gen4_init_vtable_surface_functions(brw); >gen4_init_vtable_sampler_functions(brw); > diff --git a/src/mesa/drivers/dri/i965/brw_state.h > b/src/mesa/drivers/dri/i965/brw_state.h > index 6f1db6c..8e176f3 100644 > --- a/src/mesa/drivers/dri/i965/brw_state.h > +++ b/src/mesa/drivers/dri/i965/brw_state.h > @@ -262,6 +262,9 @@ calculate_attr_overrides(const struct brw_context *brw, > uint32_t 
*flat_enables, > uint32_t *urb_entry_read_length); > > +/* gen6_surface_state.c */ > +void gen6_init_vtable_surface_functions(struct brw_context *brw); > + > /* brw_vs_surface_state.c */ > void > brw_upload_pull_constants(struct brw_context *brw, > diff --git a/src/mesa/drivers/dri/i965/gen6_surface_state.c > b/src/mesa/drivers/dri/i965/gen6_surface_state.c > new file mode 100644 > index 000..9fec372 > --- /dev/null > +++ b/src/mesa/drivers/dri/i965/gen6_surface_state.c > @@ -0,0 +1,152 @@ > +/* > + * Copyright (c) 2014 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > + * IN THE SOFTWARE. 
> + */ > + > + > +#include "main/context.h" > +#include "main/blend.h" > +#include "main/mtypes.h" > +#include "main/samplerobj.h" > +#include "program/prog_parameter.h" > + > +#include "intel_mipmap_tree.h" > +#include "intel_batchbuffer.h" > +#include "intel_tex.h" > +#include "intel_fbo.h" > +#include "intel_buffer_objects.h" > + > +#include "brw_context.h" > +#include "brw_state.h" > +#include "brw_defines.h" > +#include "brw_wm.h" > + > +/** > + * Sets up a surface state structure to point at the given region. > + * While it is only used for the front/back buffer currently, it should be > + * usable for further buffers when doing ARB_draw_buffer support. > + */ > +static void > +gen6_update_renderbuffer_surface(struct brw_context *brw, > + struct gl_renderbuffer *rb, > + bool layered, > + unsigned int unit) > +{ > + struct gl_context *ctx = &brw->ctx; > + struct intel_renderbuffer *irb = intel_renderbuffer(rb); > + struct intel_mipmap_tree *mt =
Re: [Mesa-dev] [PATCH 02/17] i965/gen6: add support for layered renderbuffers
On Friday, July 18, 2014 02:16:37 PM Jordan Justen wrote: > Rather than pointing the surface_state directly at a single > sub-image of the texture for rendering, we now point the > surface_state at the top level of the texture, and configure > the surface_state as needed based on this. > > v2: > * Use SET_FIELD as suggested by Topi > * Simplify min_array_element assignment as suggested by Topi > > Signed-off-by: Jordan Justen > --- > src/mesa/drivers/dri/i965/brw_defines.h| 4 ++ > src/mesa/drivers/dri/i965/gen6_surface_state.c | 76 > -- > 2 files changed, 40 insertions(+), 40 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h > index 8b73c5c..fa39e4e 100644 > --- a/src/mesa/drivers/dri/i965/brw_defines.h > +++ b/src/mesa/drivers/dri/i965/brw_defines.h > @@ -548,6 +548,10 @@ > /* Surface state DW4 */ > #define BRW_SURFACE_MIN_LOD_SHIFT28 > #define BRW_SURFACE_MIN_LOD_MASK INTEL_MASK(31, 28) > +#define BRW_SURFACE_MIN_ARRAY_ELEMENT_SHIFT 17 > +#define BRW_SURFACE_MIN_ARRAY_ELEMENT_MASK INTEL_MASK(27, 17) > +#define BRW_SURFACE_RENDER_TARGET_VIEW_EXTENT_SHIFT 8 > +#define BRW_SURFACE_RENDER_TARGET_VIEW_EXTENT_MASK INTEL_MASK(16, 8) > #define BRW_SURFACE_MULTISAMPLECOUNT_1 (0 << 4) > #define BRW_SURFACE_MULTISAMPLECOUNT_4 (2 << 4) > #define GEN7_SURFACE_MULTISAMPLECOUNT_1 (0 << 3) > diff --git a/src/mesa/drivers/dri/i965/gen6_surface_state.c > b/src/mesa/drivers/dri/i965/gen6_surface_state.c > index 9fec372..6fc8bdf 100644 > --- a/src/mesa/drivers/dri/i965/gen6_surface_state.c > +++ b/src/mesa/drivers/dri/i965/gen6_surface_state.c > @@ -26,6 +26,7 @@ > #include "main/blend.h" > #include "main/mtypes.h" > #include "main/samplerobj.h" > +#include "main/texformat.h" > #include "program/prog_parameter.h" > > #include "intel_mipmap_tree.h" > @@ -54,30 +55,17 @@ gen6_update_renderbuffer_surface(struct brw_context *brw, > struct intel_renderbuffer *irb = intel_renderbuffer(rb); > struct intel_mipmap_tree *mt = 
irb->mt; > uint32_t *surf; > - uint32_t tile_x, tile_y; > uint32_t format = 0; > /* _NEW_BUFFERS */ > mesa_format rb_format = _mesa_get_render_format(ctx, > intel_rb_format(irb)); > + uint32_t surftype; > + int depth = MAX2(rb->Depth, 1); I second Topi's feedback here: irb->layer_count would be better. > + GLenum gl_target = rb->TexImage ? > + rb->TexImage->TexObject->Target : GL_TEXTURE_2D; > + > uint32_t surf_index = >brw->wm.prog_data->binding_table.render_target_start + unit; > > - assert(!layered); > - > - if (rb->TexImage && !brw->has_surface_tile_offset) { > - intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y); > - > - if (tile_x != 0 || tile_y != 0) { > - /* Original gen4 hardware couldn't draw to a non-tile-aligned > - * destination in a miptree unless you actually setup your renderbuffer > - * as a miptree and used the fragile lod/array_index/etc. controls to > - * select the image. So, instead, we just make a new single-level > - * miptree and render into that. > - */ > - intel_renderbuffer_move_to_temp(brw, irb, false); > - mt = irb->mt; > - } > - } > - > intel_miptree_used_for_rendering(irb->mt); > > surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32, > @@ -89,30 +77,38 @@ gen6_update_renderbuffer_surface(struct brw_context *brw, > __FUNCTION__, _mesa_get_format_name(rb_format)); > } > > - surf[0] = (BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT | > - format << BRW_SURFACE_FORMAT_SHIFT); > + switch (gl_target) { > + case GL_TEXTURE_CUBE_MAP_ARRAY: > + case GL_TEXTURE_CUBE_MAP: > + surftype = BRW_SURFACE_2D; > + depth *= 6; > + break; > + default: > + surftype = translate_tex_target(gl_target); > + break; > + } > + > + const int min_array_element = layered ? 
0 : irb->mt_layer; > + > + surf[0] = SET_FIELD(surftype, BRW_SURFACE_TYPE) | > + SET_FIELD(format, BRW_SURFACE_FORMAT); > > /* reloc */ > - surf[1] = (intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y) + > - mt->bo->offset64); > - > - surf[2] = ((rb->Width - 1) << BRW_SURFACE_WIDTH_SHIFT | > - (rb->Height - 1) << BRW_SURFACE_HEIGHT_SHIFT); > - > - surf[3] = (brw_get_surface_tiling_bits(mt->tiling) | > - (mt->pitch - 1) << BRW_SURFACE_PITCH_SHIFT); > - > - surf[4] = brw_get_surface_num_multisamples(mt->num_samples); > - > - assert(brw->has_surface_tile_offset || (tile_x == 0 && tile_y == 0)); > - /* Note that the low bits of these fields are missing, so > -* there's the possibility of getting in trouble. > -*/ > - assert(tile_x % 4 == 0); > - assert(tile_y % 2 == 0); > - surf[5] = ((tile_x / 4) << BRW_SURFACE_X_OFFSET_SHIFT | > - (tile_y / 2) << BRW_SURFACE_Y_OFFSET_SHIFT | >
[Mesa-dev] [Bug 80848] [dri3] Building mesa fails with dri3 enabled
https://bugs.freedesktop.org/show_bug.cgi?id=80848

--- Comment #10 from Emil Velikov ---
(In reply to comment #9)
> That patch makes no difference for me.

Hmm, in that case I suspect that you're hitting a more elaborate (and
different) bug than the one reported by Juha-Pekka.

If the original link succeeds and we fail at the re-linking stage, that
indicates that the LDFLAGS|LIBS are somewhat bonkers. Which one(s), I'm not
sure :\

There was a mildly related issue here [1]. You're building without llvm so
I'm not sure it will help, but it is still worth a shot.
*Apply on top of the patch in comment 7.

Cheers,
Emil

[1] https://bugs.gentoo.org/show_bug.cgi?id=501328#c47

It would be great if Juha-Pekka could confirm whether my patch helps in his
case or not.
[Mesa-dev] [PATCH] Add an accelerated version of F_TO_I for x86_64
According to a quick micro-benchmark, this new version is 20% faster on my Haswell laptop. v2: Removed the XXX note about x86_64 from the comment v3: Use an intrinsic instead of an __asm__ block. This should give us MSVC support for free. Signed-off-by: Jason Ekstrand --- src/mesa/main/imports.h | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h index af780b2..6eb84ca 100644 --- a/src/mesa/main/imports.h +++ b/src/mesa/main/imports.h @@ -274,10 +274,12 @@ static inline int IROUND_POS(float f) return (int) (f + 0.5F); } +#if defined(USE_X86_64_ASM) +# include +#endif /** * Convert float to int using a fast method. The rounding mode may vary. - * XXX We could use an x86-64/SSE2 version here. */ static inline int F_TO_I(float f) { @@ -292,6 +294,8 @@ static inline int F_TO_I(float f) fistp r } return r; +#elif defined(USE_X86_64_ASM) + return _mm_cvt_ss2si(_mm_load_ss(&f)); #else return IROUND(f); #endif -- 2.0.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
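For readers following the patch, here is a minimal standalone sketch of the same idea, not the Mesa code itself. The name f_to_i_sketch is made up for illustration, the platform check uses the compilers' own predefined macros rather than USE_X86_64_ASM, and the lrintf() fallback is an assumption standing in for IROUND(). With the default rounding mode the conversion rounds to nearest even, which is why the original comment says the rounding mode may vary.

```c
#include <math.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>   /* declares _mm_load_ss() and _mm_cvt_ss2si() */
#endif

/* Hypothetical stand-in for F_TO_I; not the function from imports.h. */
static inline int f_to_i_sketch(float f)
{
#if defined(__x86_64__) || defined(_M_X64)
   /* CVTSS2SI converts using the current MXCSR rounding mode,
    * which is round-to-nearest-even unless the program changed it. */
   return _mm_cvt_ss2si(_mm_load_ss(&f));
#else
   /* Assumption: lrintf() as a portable fallback; it also follows the
    * current floating-point rounding mode. */
   return (int) lrintf(f);
#endif
}
```

Note that, unlike (int) (f + 0.5F), this maps 2.5f to 2 and 3.5f to 4 under the default rounding mode.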
Re: [Mesa-dev] [PATCH 04/17] i965: Split gen6 depth hiz state out from brw
On Tuesday, July 22, 2014 11:53:16 AM Pohjolainen, Topi wrote:
> On Fri, Jul 18, 2014 at 02:16:39PM -0700, Jordan Justen wrote:
> > We will program the gen6 hiz depth state differently to enable layered
> > rendering on gen6.
> >
> > v2:
> > * Remove unneeded gen6_emit_depthbuffer as suggested by Topi
> >
> > Signed-off-by: Jordan Justen
>
> Compared side by side with brw_emit_depth_stencil_hiz() and looks identical.
> I was hoping we could start merging gen6-gen8 surface state logic as there
> seems to be quite a bit of overlap. This change adds even more but I think
> it is still a step forward as now gen6 is closer to gen7/8.
> Proper refactoring between different generations requires some more thinking
> and possibly some trial-and-error too. This patch will make that work easier
> in my opinion.
>
> Reviewed-by: Topi Pohjolainen

I'm really glad to see this split - I did something similar a long time ago,
but it conflicted with Paul's other work, so I never landed it. Either that
or I never quite got it working. :)

That said, you really should clean up both halves of the split: delete the
Gen6 checks from brw_emit_depth_stencil_hiz, and delete the Gen4-5 checks
from gen6_emit_depth_stencil_hiz. You could do that here, or in a patch or
two immediately following the copy-and-paste.

--Ken
Re: [Mesa-dev] [PATCH 05/17] i965/gen6 depth surface: calculate more specific surface type
On Tuesday, July 22, 2014 12:19:48 PM Pohjolainen, Topi wrote: > On Fri, Jul 18, 2014 at 02:16:40PM -0700, Jordan Justen wrote: > > (171e633 for gen6) > > > > This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. > > > > Note: Cube maps are treated as 2D arrays with 6 times as > > many array elements as the cube map array would have. > > > > Signed-off-by: Jordan Justen > > --- > > src/mesa/drivers/dri/i965/gen6_blorp.cpp | 17 ++ > > src/mesa/drivers/dri/i965/gen6_depth_state.c | 33 > > > > 2 files changed, 50 insertions(+) > > > > diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp > > b/src/mesa/drivers/dri/i965/gen6_blorp.cpp > > index eb865b9..3fc36aa 100644 > > --- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp > > +++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp > > @@ -791,6 +791,23 @@ gen6_blorp_emit_depth_stencil_config(struct > > brw_context *brw, > > uint32_t draw_x = params->depth.x_offset; > > uint32_t draw_y = params->depth.y_offset; > > uint32_t tile_mask_x, tile_mask_y; > > + uint32_t surftype; > > + GLenum gl_target = params->depth.mt->target; > > + > > + switch (gl_target) { > > + case GL_TEXTURE_CUBE_MAP_ARRAY: > > + case GL_TEXTURE_CUBE_MAP: > > + /* The PRM claims that we should use BRW_SURFACE_CUBE for this > > + * situation, but experiments show that gl_Layer doesn't work when > > we do > > + * this. So we use BRW_SURFACE_2D, since for rendering purposes > > this is > > + * equivalent. > > + */ > > + surftype = BRW_SURFACE_2D; > > + break; > > + default: > > + surftype = translate_tex_target(gl_target); > > + break; > > + } > > Patches 5-7 look identical to the gen7 equivalent and hence in principle: > > Reviewed-by: Topi Pohjolainen > > But having said that I think we need to start planning how to merge all the > duplicate surface state logic between gen6-gen8. For example, the switch > statement above can be now found twice per generation. Agreed. 
I'd like to see this work land - split Gen4-5/Gen6 out - then look at combining things in a better way. Patches 5-7 are: Reviewed-by: Kenneth Graunke
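As a rough illustration of the switch statement discussed in this thread, the sketch below shows how a cube map can be described as a plain 2D array with six elements per cube. All names are hypothetical stand-ins: the enums replace the GL targets and BRW_SURFACE_* values, translate_tex_target() is not reproduced, and the sketch assumes the incoming depth counts cubes rather than faces.

```c
/* Hypothetical stand-ins for the GL texture targets. */
enum sketch_target {
   SKETCH_TARGET_2D,
   SKETCH_TARGET_2D_ARRAY,
   SKETCH_TARGET_3D,
   SKETCH_TARGET_CUBE,
   SKETCH_TARGET_CUBE_ARRAY
};

/* Hypothetical stand-ins for the hardware surface types. */
enum sketch_surftype {
   SKETCH_SURF_2D,
   SKETCH_SURF_3D
};

struct sketch_surface {
   enum sketch_surftype type;
   int depth;   /* number of array elements or 3D slices */
};

static struct sketch_surface
sketch_surface_for_target(enum sketch_target target, int depth)
{
   struct sketch_surface s;

   /* Same effect as MAX2(depth, 1): a non-array surface has one layer. */
   s.depth = depth > 1 ? depth : 1;

   switch (target) {
   case SKETCH_TARGET_CUBE_ARRAY:
   case SKETCH_TARGET_CUBE:
      /* Treat the cube map as a 2D array with six faces per cube. */
      s.type = SKETCH_SURF_2D;
      s.depth *= 6;
      break;
   case SKETCH_TARGET_3D:
      s.type = SKETCH_SURF_3D;
      break;
   default:
      s.type = SKETCH_SURF_2D;
      break;
   }

   return s;
}
```

A helper of this shape, shared between the renderbuffer, depth and blorp paths, is one possible way to avoid repeating the switch per generation; whether that fits the driver is for the refactoring mentioned above to decide.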
Re: [Mesa-dev] [PATCH] added functions for binding atomic buffers to extension GL_ARB_shader_atomic_counters for radeonsi and r600 backend
Please see:

http://lists.freedesktop.org/archives/mesa-dev/2014-July/062818.html
http://lists.freedesktop.org/archives/mesa-dev/2014-July/063798.html

Also, your git username and address are wrong. You can set them with git
config.

Marek

On Wed, Jul 23, 2014 at 8:27 PM, Aditya Atluri wrote:
> From: Frost
> [snip]
Re: [Mesa-dev] [PATCH] added functions for binding atomic buffers to extension GL_ARB_shader_atomic_counters for radeonsi and r600 backend
On Wed, Jul 23, 2014 at 3:22 PM, Marek Olšák wrote:
> Please see:
>
> http://lists.freedesktop.org/archives/mesa-dev/2014-July/062818.html
> http://lists.freedesktop.org/archives/mesa-dev/2014-July/063798.html
>
> Also, your git username and address are wrong. You can set them with git
> config.

In addition to these more basic issues... is this patch needed at all?
I thought the core code was all done and it was just the mesa/st +
gallium interfaces that needed to be fixed up.

> Marek
>
> On Wed, Jul 23, 2014 at 8:27 PM, Aditya Atluri wrote:
>> From: Frost
>> [snip]
Re: [Mesa-dev] [PATCH] added functions for binding atomic buffers to extension GL_ARB_shader_atomic_counters for radeonsi and r600 backend
I thought so too, but these bits are really missing there, e.g.
glDeleteBuffers doesn't unbind atomic buffers, etc.

Marek

On Wed, Jul 23, 2014 at 9:25 PM, Ilia Mirkin wrote:
> On Wed, Jul 23, 2014 at 3:22 PM, Marek Olšák wrote:
>> Please see:
>>
>> http://lists.freedesktop.org/archives/mesa-dev/2014-July/062818.html
>> http://lists.freedesktop.org/archives/mesa-dev/2014-July/063798.html
>>
>> Also, your git username and address are wrong. You can set them with git
>> config.
>
> In addition to these more basic issues... is this patch needed at all?
> I thought the core code was all done and it was just the mesa/st +
> gallium interfaces that needed to be fixed up.
>
>> Marek
>>
>> On Wed, Jul 23, 2014 at 8:27 PM, Aditya Atluri wrote:
>>> From: Frost
>>> [snip]
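To make the glDeleteBuffers point concrete, here is a simplified sketch of what "unbinding on delete" means. It is not the Mesa implementation: the struct and function names are invented, the binding limit is a made-up constant, and bindings are reset to NULL instead of going through _mesa_BindBufferBase() and the shared null buffer object.

```c
#include <stddef.h>

#define SKETCH_MAX_ATOMIC_BINDINGS 8   /* made-up limit for the sketch */

/* Hypothetical buffer object; only its identity matters here. */
struct sketch_buffer {
   unsigned name;
};

/* Hypothetical context holding the atomic counter buffer bindings. */
struct sketch_context {
   struct sketch_buffer *atomic_buffer;   /* generic binding point */
   struct sketch_buffer *atomic_bindings[SKETCH_MAX_ATOMIC_BINDINGS];
};

/* Reset every atomic binding that still refers to buf.
 * Returns the number of binding points that were reset. */
static int
sketch_unbind_on_delete(struct sketch_context *ctx,
                        const struct sketch_buffer *buf)
{
   int unbound = 0;

   /* Indexed binding points first. */
   for (int i = 0; i < SKETCH_MAX_ATOMIC_BINDINGS; i++) {
      if (ctx->atomic_bindings[i] == buf) {
         ctx->atomic_bindings[i] = NULL;
         unbound++;
      }
   }

   /* Then the generic binding point. */
   if (ctx->atomic_buffer == buf) {
      ctx->atomic_buffer = NULL;
      unbound++;
   }

   return unbound;
}
```

Without a step like this, a deleted buffer would stay reachable through the context's binding points, which is the gap described in the reply.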
Re: [Mesa-dev] [PATCH] r600g: Implement BPTC texture support
On Wed, Jul 23, 2014 at 1:16 PM, Glenn Kennard wrote:
> Signed-off-by: Glenn Kennard
> ---
> This patch depends on Ilia Mirkin's "nvc0: add BPTC format support"
> and Neil Robert's core BPTC support patches.
>
>  src/gallium/drivers/r600/r600_state_common.c | 23 +++
>  1 file changed, 23 insertions(+)
>
> diff --git a/src/gallium/drivers/r600/r600_state_common.c
> b/src/gallium/drivers/r600/r600_state_common.c
> index 8c37d0d..2f39df3 100644
> --- a/src/gallium/drivers/r600/r600_state_common.c
> +++ b/src/gallium/drivers/r600/r600_state_common.c
> @@ -1967,6 +1967,29 @@ uint32_t r600_translate_texformat(struct pipe_screen
> *screen,
>                 }
>         }
>
> +       if (desc->layout == UTIL_FORMAT_LAYOUT_BPTC) {
> +               if (!enable_s3tc)
> +                       goto out_unknown;
> +
> +               if (rscreen->b.chip_class < EVERGREEN)
> +                       goto out_unknown;
> +
> +               switch (format) {
> +               case PIPE_FORMAT_BPTC_RGBA_UNORM:
> +               case PIPE_FORMAT_BPTC_SRGBA_UNORM:
> +                       result = FMT_BC7;
> +                       is_srgb_valid = TRUE;
> +                       goto out_word4;
> +               case PIPE_FORMAT_BPTC_RGB_FLOAT:
> +               case PIPE_FORMAT_BPTC_RGB_UFLOAT:
> +                       result = FMT_BC6;
> +                       is_srgb_valid = TRUE;

BC6 shouldn't be "srgb_valid". It won't have any effect though, because we
don't have any formats which are both float and srgb.

Marek
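The review comment can be pictured with a small sketch in which only the BC7 cases mark sRGB as valid. The enums and the function below are hypothetical stand-ins for the PIPE_FORMAT_BPTC_* and FMT_BC* values, and the goto-based control flow of r600_translate_texformat() is not reproduced.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PIPE_FORMAT_BPTC_* formats. */
enum sketch_format {
   SKETCH_BPTC_RGBA_UNORM,
   SKETCH_BPTC_SRGBA_UNORM,
   SKETCH_BPTC_RGB_FLOAT,
   SKETCH_BPTC_RGB_UFLOAT,
   SKETCH_FORMAT_OTHER
};

/* Hypothetical stand-ins for the hardware data formats. */
enum sketch_hw_format {
   SKETCH_HW_UNKNOWN,
   SKETCH_HW_BC6,
   SKETCH_HW_BC7
};

struct sketch_translation {
   enum sketch_hw_format hw;
   bool srgb_valid;
};

static struct sketch_translation
sketch_translate_bptc(enum sketch_format format)
{
   struct sketch_translation t = { SKETCH_HW_UNKNOWN, false };

   switch (format) {
   case SKETCH_BPTC_RGBA_UNORM:
   case SKETCH_BPTC_SRGBA_UNORM:
      /* BC7 holds normalized data, so an sRGB view makes sense. */
      t.hw = SKETCH_HW_BC7;
      t.srgb_valid = true;
      break;
   case SKETCH_BPTC_RGB_FLOAT:
   case SKETCH_BPTC_RGB_UFLOAT:
      /* BC6 holds float data; leave srgb_valid false. */
      t.hw = SKETCH_HW_BC6;
      break;
   default:
      break;
   }

   return t;
}
```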
[Mesa-dev] [PATCH] radeonsi: implement BPTC texture support
Passes corrected piglit test and should also handle signed vs unsigned float correctly. --- src/gallium/drivers/radeonsi/si_state.c | 20 1 file changed, 20 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 3dec536..6b64e7c 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -1102,6 +1102,22 @@ static uint32_t si_translate_texformat(struct pipe_screen *screen, } } + if (desc->layout == UTIL_FORMAT_LAYOUT_BPTC) { + if (!enable_s3tc) + goto out_unknown; + + switch (format) { + case PIPE_FORMAT_BPTC_RGBA_UNORM: + case PIPE_FORMAT_BPTC_SRGBA_UNORM: + return V_008F14_IMG_DATA_FORMAT_BC7; + case PIPE_FORMAT_BPTC_RGB_FLOAT: + case PIPE_FORMAT_BPTC_RGB_UFLOAT: + return V_008F14_IMG_DATA_FORMAT_BC6; + default: + goto out_unknown; + } + } + if (desc->layout == UTIL_FORMAT_LAYOUT_SUBSAMPLED) { switch (format) { case PIPE_FORMAT_R8G8_B8G8_UNORM: @@ -2467,12 +2483,16 @@ static struct pipe_sampler_view *si_create_sampler_view(struct pipe_context *ctx case PIPE_FORMAT_DXT1_SRGBA: case PIPE_FORMAT_DXT3_SRGBA: case PIPE_FORMAT_DXT5_SRGBA: + case PIPE_FORMAT_BPTC_SRGBA_UNORM: num_format = V_008F14_IMG_NUM_FORMAT_SRGB; break; case PIPE_FORMAT_RGTC1_SNORM: case PIPE_FORMAT_LATC1_SNORM: case PIPE_FORMAT_RGTC2_SNORM: case PIPE_FORMAT_LATC2_SNORM: + /* implies float, so use SNORM/UNORM to determine + whether data is signed or not */ + case PIPE_FORMAT_BPTC_RGB_FLOAT: num_format = V_008F14_IMG_NUM_FORMAT_SNORM; break; default: -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/7] glsl/glsl_parser.yy: vs12 doesn't have strcasecmp, use _stricmp instead
On 07/22/2014 02:07 PM, Alon Levy wrote:
> Signed-off-by: Alon Levy
> ---
>  src/glsl/glsl_parser.yy | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> index faaf438..25370cd 100644
> --- a/src/glsl/glsl_parser.yy
> +++ b/src/glsl/glsl_parser.yy
> @@ -26,6 +26,10 @@
>  #include
>  #include
>
> +#ifdef _MSC_VER <= 1800
> +#define strcasecmp _stricmp
> +#endif
> +

glsl_parser.yy should already get the strcasecmp workaround from
src/mesa/main/imports.h.

>  #include "ast.h"
>  #include "glsl_parser_extras.h"
>  #include "glsl_types.h"
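For reference, a hedged sketch of how such a compatibility guard is commonly written. It is not the workaround from src/mesa/main/imports.h; the sketch_ names are invented for illustration. One detail worth noting about the quoted hunk: #ifdef only tests whether a single macro is defined, so a version comparison needs the #if defined(...) && ... form instead.

```c
#include <string.h>

#if defined(_MSC_VER)
/* MSVC's C runtime has no strcasecmp(); _stricmp() is its counterpart.
 * A version-limited variant would read:
 *    #if defined(_MSC_VER) && _MSC_VER <= 1800
 */
#define sketch_strcasecmp _stricmp
#else
#include <strings.h>   /* POSIX declares strcasecmp() here */
#define sketch_strcasecmp strcasecmp
#endif

/* Hypothetical helper: returns 1 if a and b match ignoring case. */
static int sketch_equal_nocase(const char *a, const char *b)
{
   return sketch_strcasecmp(a, b) == 0;
}
```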
Re: [Mesa-dev] [PATCH] radeonsi: implement BPTC texture support
Reviewed-by: Marek Olšák Marek On Wed, Jul 23, 2014 at 11:00 PM, Grigori Goronzy wrote: > Passes corrected piglit test and should also handle signed vs unsigned > float correctly. > --- > src/gallium/drivers/radeonsi/si_state.c | 20 > 1 file changed, 20 insertions(+) > > diff --git a/src/gallium/drivers/radeonsi/si_state.c > b/src/gallium/drivers/radeonsi/si_state.c > index 3dec536..6b64e7c 100644 > --- a/src/gallium/drivers/radeonsi/si_state.c > +++ b/src/gallium/drivers/radeonsi/si_state.c > @@ -1102,6 +1102,22 @@ static uint32_t si_translate_texformat(struct > pipe_screen *screen, > } > } > > + if (desc->layout == UTIL_FORMAT_LAYOUT_BPTC) { > + if (!enable_s3tc) > + goto out_unknown; > + > + switch (format) { > + case PIPE_FORMAT_BPTC_RGBA_UNORM: > + case PIPE_FORMAT_BPTC_SRGBA_UNORM: > + return V_008F14_IMG_DATA_FORMAT_BC7; > + case PIPE_FORMAT_BPTC_RGB_FLOAT: > + case PIPE_FORMAT_BPTC_RGB_UFLOAT: > + return V_008F14_IMG_DATA_FORMAT_BC6; > + default: > + goto out_unknown; > + } > + } > + > if (desc->layout == UTIL_FORMAT_LAYOUT_SUBSAMPLED) { > switch (format) { > case PIPE_FORMAT_R8G8_B8G8_UNORM: > @@ -2467,12 +2483,16 @@ static struct pipe_sampler_view > *si_create_sampler_view(struct pipe_context *ctx > case PIPE_FORMAT_DXT1_SRGBA: > case PIPE_FORMAT_DXT3_SRGBA: > case PIPE_FORMAT_DXT5_SRGBA: > + case PIPE_FORMAT_BPTC_SRGBA_UNORM: > num_format = > V_008F14_IMG_NUM_FORMAT_SRGB; > break; > case PIPE_FORMAT_RGTC1_SNORM: > case PIPE_FORMAT_LATC1_SNORM: > case PIPE_FORMAT_RGTC2_SNORM: > case PIPE_FORMAT_LATC2_SNORM: > + /* implies float, so use SNORM/UNORM to > determine > + whether data is signed or not */ > + case PIPE_FORMAT_BPTC_RGB_FLOAT: > num_format = > V_008F14_IMG_NUM_FORMAT_SNORM; > break; > default: > -- > 1.8.3.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/16] glsl: Track matrix layout of structure fields using two bits
On 07/21/2014 03:17 PM, Matt Turner wrote:
> On Mon, Jul 21, 2014 at 2:04 PM, Ian Romanick wrote:
>> +enum glsl_matrix_layout {
>> +   GLSL_MATRIX_LAYOUT_DEFAULT,
>
> Does this mean language-default, or does it really mean the inherited
> layout? E.g., for
>
> layout(row_major) uniform a {
>    mat4 m;
> };
>
> m's .matrix_layout is GLSL_MATRIX_LAYOUT_DEFAULT, so we look to the
> outer row_major qualifier on uniform a?

Correct. If some entity inside a block has GLSL_MATRIX_LAYOUT_DEFAULT then it either is not (or cannot contain) a matrix, or its layout is inherited from the next outer container. The interface type itself will never have GLSL_MATRIX_LAYOUT_DEFAULT.

> If so, could we name it _INHERITED or something?

That works. I think I like that better than _NOT_SET. The layout may not be set for the interface type, but it will always have a layout of either GLSL_MATRIX_LAYOUT_COLUMN_MAJOR or GLSL_MATRIX_LAYOUT_ROW_MAJOR.

> Maybe I've misunderstood.
>
>> +   GLSL_MATRIX_LAYOUT_COLUMN_MAJOR,
>> +   GLSL_MATRIX_LAYOUT_ROW_MAJOR
>> +};
>
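
The inheritance rule described in this reply can be sketched as a walk from a field up through its containers until an explicit layout is found. This is only an illustration under assumptions: the enum uses the _INHERITED name proposed in the thread, and struct layout_node with its parent pointer is a hypothetical stand-in, not the data structure Mesa uses.

```c
#include <stddef.h>

/* Enum from the patch, with the _INHERITED rename discussed above. */
enum glsl_matrix_layout {
    GLSL_MATRIX_LAYOUT_INHERITED,
    GLSL_MATRIX_LAYOUT_COLUMN_MAJOR,
    GLSL_MATRIX_LAYOUT_ROW_MAJOR
};

/* Hypothetical stand-in for a field, nested struct, or interface block. */
struct layout_node {
    enum glsl_matrix_layout matrix_layout;
    const struct layout_node *parent; /* NULL for the interface block itself */
};

/* Walk outward until a container with an explicit layout is reached.
 * The interface block never has INHERITED, so the walk always ends there
 * at the latest. */
static enum glsl_matrix_layout
resolve_matrix_layout(const struct layout_node *node)
{
    while (node->matrix_layout == GLSL_MATRIX_LAYOUT_INHERITED &&
           node->parent != NULL)
        node = node->parent;
    return node->matrix_layout;
}
```

For the example in the quoted mail, m would carry GLSL_MATRIX_LAYOUT_INHERITED and resolve to the row_major layout of block a.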
Re: [Mesa-dev] [PATCH 2/6] Add the format enums for BPTC-compressed images
On 07/22/2014 12:09 PM, Neil Roberts wrote:
> diff --git a/src/mesa/main/texcompress.c b/src/mesa/main/texcompress.c
> index 9dbfe9f..b708b49 100644
> --- a/src/mesa/main/texcompress.c
> +++ b/src/mesa/main/texcompress.c
> @@ -235,6 +235,12 @@ _mesa_gl_compressed_format_base_format(GLenum format)
>   * GL_EXT_texture_compression_latc. At the very least, Catalyst 11.6 does not
>   * expose the 3dc formats through this mechanism.
>   *
> + * The spec for GL_ARB_texture_compression_bptc doesn't mention whether it
> + * should be included in GL_COMPRESSED_TEXTURE_FORMATS. However as it takes a
> + * very long time to compress the textures in this format it's probably not
> + * very useful as a general format where the GL will have to compress it on
> + * the fly.
> + *

What do NVIDIA and AMD do? We should mimic that.

>   * \param ctx the GL context
>   * \param formats the resulting format list (may be NULL).
>   *
Re: [Mesa-dev] [PATCH 0/6] Add support for BPTC texture compression
On 07/22/2014 12:09 PM, Neil Roberts wrote: > Here's a first attempt at a patch series to implement BPTC texture > compression in the i965 driver on Gen>=7. > > Getting it to work on the hardware is pretty trivial as it's just a > case of adding some new Mesa format enums and then plugging them > together with the right Intel surface type. However GL requires that > you are able to get the library to compress textures on the fly so we > need a compressor too. I think for BPTC it doesn't really make much > sense to actually use this because it takes a very long time to search > the entire space and compress an image properly. For example the > NVidia compressor takes in the order of an hour for a full-screen > image. Instead I've just done the minimal work needed to get something > that gives vaguely passable results. Is that NVIDIA's off-line compression tool, or is that the compressor in the driver? A brute-force compressor will be very, very slow for BPTC. There are other approaches that are much faster without sacrificing very much quality. https://software.intel.com/en-us/vcsource/samples/fast-texture-compression An implementation of this algorithm is available. https://github.com/Mokosha/FasTC It seems like we could just link with the compression libraries produced by that package. Though, the license ("Permission to incorporate this software into commercial products may be obtained by contacting the authors or the Office of Technology Development at the University of North Carolina at Chapel Hill .") may be problematic? Note the FasTC project also supports ASTC, and we will need to support that format eventually also. > For the two normalized formats I've just made it first compress each > block with the existing DXT3 compressor and then convert the block to > mode 4 of BPTC. 
This ends up being worse than just using DXT3 directly > because it will lose a bit from the green component of the endpoints > and each alpha index will be 3 bits instead of 4, but it looks ok. This means that we can only enable BPTC when libtxc_dxtn.so is available... which means patch 5 needs some changes. :( > For the two half-float formats I've written a custom compressor which > just has a very simple algorithm and always uses mode 3. > > I've also written software texel fetch functions for all of the > formats. I guess in theory we don't need these because we should just > be able to use the hardware to decompress if someone calls > glGetTexImage. However Mesa has a static assert to require a texel > fetch function and it seemed like a good way to learn more about the > format so I wrote the functions anyway. It also means we can enable > the extension on the software rasterizer. > > I've written a Piglit test for the decompressor with the normalized > formats. I also tested the half-float decompressor using NVidia's > sample texture which tries every mode. It would be good to make a > Piglit test that does this as well. > > - Neil > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] added functions for binding atomic buffers to extension GL_ARB_shader_atomic_counters for radeonsi and r600 backend
On 07/23/2014 12:39 PM, Marek Olšák wrote: > I thought so too, but these bits are really missing there, e.g. > glDeleteBuffers doesn't unbind atomic buffers, etc. D'oh. It sounds like we need some piglit tests and probably some spec quotations. :( > Marek > > On Wed, Jul 23, 2014 at 9:25 PM, Ilia Mirkin wrote: >> On Wed, Jul 23, 2014 at 3:22 PM, Marek Olšák wrote: >>> Please see: >>> >>> http://lists.freedesktop.org/archives/mesa-dev/2014-July/062818.html >>> http://lists.freedesktop.org/archives/mesa-dev/2014-July/063798.html >>> >>> Also, your git username and address are wrong. You can set them with git >>> config. >> >> In addition to these more basic issues... is this patch needed at all? >> I thought the core code was all done and it was just the mesa/st + >> gallium interfaces that needed to be fixed up. >> >>> >>> Marek >>> >>> On Wed, Jul 23, 2014 at 8:27 PM, Aditya Atluri >>> wrote: From: Frost --- src/mesa/main/bufferobj.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c index 7b1bba0..00f2604 100644 --- a/src/mesa/main/bufferobj.c +++ b/src/mesa/main/bufferobj.c @@ -832,6 +832,9 @@ _mesa_init_buffer_objects( struct gl_context *ctx ) _mesa_reference_buffer_object(ctx, &ctx->UniformBuffer, ctx->Shared->NullBufferObj); + _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer, + ctx->Shared->NullBufferObj); + _mesa_reference_buffer_object(ctx, &ctx->DrawIndirectBuffer, ctx->Shared->NullBufferObj); @@ -842,6 +845,14 @@ _mesa_init_buffer_objects( struct gl_context *ctx ) ctx->UniformBufferBindings[i].Offset = -1; ctx->UniformBufferBindings[i].Size = -1; } + + for (i = 0; i < MAX_COMBINED_ATOMIC_BUFFERS; i++) { + _mesa_reference_buffer_object(ctx, + &ctx->AtomicBufferBindings[i].BufferObject, +ctx->Shared->NullBufferObj); + ctx->AtomicBufferBindings[i].Offset = -1; + ctx->AtomicBufferBindings[i].Size = -1; + } } @@ -857,6 +868,8 @@ _mesa_free_buffer_objects( struct gl_context *ctx ) 
_mesa_reference_buffer_object(ctx, &ctx->UniformBuffer, NULL); + _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer, NULL); + _mesa_reference_buffer_object(ctx, &ctx->DrawIndirectBuffer, NULL); for (i = 0; i < MAX_COMBINED_UNIFORM_BUFFERS; i++) { @@ -864,6 +877,12 @@ _mesa_free_buffer_objects( struct gl_context *ctx ) &ctx->UniformBufferBindings[i].BufferObject, NULL); } + + for (i = 0; i < MAX_COMBINED_ATOMIC_BUFFERS; i++) { + _mesa_reference_buffer_object(ctx, + &ctx->AtomicBufferBindings[i].BufferObject, +NULL); + } } bool @@ -1200,6 +1219,17 @@ _mesa_DeleteBuffers(GLsizei n, const GLuint *ids) _mesa_BindBuffer( GL_UNIFORM_BUFFER, 0 ); } + /* unbind Atomic Buffers binding points */ + for (j = 0; j < ctx->Const.MaxAtomicBufferBindings; j++) { + if (ctx->AtomicBufferBindings[j].BufferObject == bufObj) { + _mesa_BindBufferBase( GL_ATOMIC_COUNTER_BUFFER, j, 0 ); + } + } + + if (ctx->AtomicBuffer == bufObj) { +_mesa_BindBuffer(GL_ATOMIC_COUNTER_BUFFER, 0); + } + /* unbind any pixel pack/unpack pointers bound to this buffer */ if (ctx->Pack.BufferObj == bufObj) { _mesa_BindBuffer( GL_PIXEL_PACK_BUFFER_EXT, 0 ); -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev >>> ___ >>> mesa-dev mailing list >>> mesa-dev@lists.freedesktop.org >>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
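
The glDeleteBuffers behaviour under discussion (every binding point that still references the deleted buffer gets reset) can be sketched in isolation. This is a simplified illustration under assumptions: the sketch_* types and the binding count are hypothetical, and bindings are reset to NULL here, whereas the quoted patch rebinds buffer name 0 through _mesa_BindBufferBase() and _mesa_BindBuffer().

```c
#include <stddef.h>

#define SKETCH_MAX_ATOMIC_BINDINGS 4 /* hypothetical limit for the sketch */

struct sketch_buffer {
    unsigned name;
};

struct sketch_context {
    struct sketch_buffer *atomic_buffer; /* generic binding point */
    struct sketch_buffer *atomic_bindings[SKETCH_MAX_ATOMIC_BINDINGS]; /* indexed */
};

/* Reset every atomic-counter binding that refers to the buffer being
 * deleted. Returns the number of binding points that were reset. */
static unsigned
sketch_unbind_atomic(struct sketch_context *ctx,
                     const struct sketch_buffer *doomed)
{
    unsigned cleared = 0;
    size_t i;

    for (i = 0; i < SKETCH_MAX_ATOMIC_BINDINGS; i++) {
        if (ctx->atomic_bindings[i] == doomed) {
            ctx->atomic_bindings[i] = NULL;
            cleared++;
        }
    }

    if (ctx->atomic_buffer == doomed) {
        ctx->atomic_buffer = NULL;
        cleared++;
    }

    return cleared;
}
```

A piglit test along the lines Ian asks for would presumably bind a buffer to both kinds of binding point, delete it, and then query that the bindings read back as zero.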
Re: [Mesa-dev] [PATCH 04/16] glsl: Add is_{matrix, record, interface}_or_array_of predicates
On 07/21/2014 08:03 PM, Timothy Arceri wrote: > On Mon, 2014-07-21 at 14:04 -0700, Ian Romanick wrote: >> From: Ian Romanick >> >> There are a bunch of places, especially in the UBO code, where we check >> whether something is a matrix (or record) when we actually want to know >> if it a matrix or an array of matrices (ditto for records). > > Hi Ian, > > I sent an alternative to this as part of and arrays of arrays series > (patch 2) back in May [1]. The advantage is that it means adding only > one extra function to glsl_types rather than three, it supports arrays > of arrays and is more generic so can be used for other types. > I guess it may be a little less readable then your alternative but that > could probably be fixed by giving it a better name (its wrong anyway as > it should be innermost not outermost). I kind of like that. I'm not sure about the naming, though. Neither innermost_element_type nor outermost_element_type is a good name. To me, that would imply that ivec4->innermost_element_type() is float, but it's actually vec4. Since this method only peels off array types, the name should communicate that. Maybe type_without_array or just without_array? Then the code would look like: if (var->type->without_array()->is_matrix()) ... > Anyway its just a suggestion. It will be easy enough to add arrays of > arrays support to these functions later on. > I haven't been working on this for a few weeks but I recall that uniform > code you changed seems to be fairly arrays of arrays friendly allowing > my suggestion to be used there [2] > > [1] http://lists.freedesktop.org/archives/mesa-dev/2014-May/059271.html > [2] > https://github.com/tarceri/Mesa_arrays_of_arrays/commit/ba422820d0e8b9944fd3d0278913ae3cfbb184b2 > > >> This will be used in later patches in this series. 
>> >> Signed-off-by: Ian Romanick >> --- >> src/glsl/glsl_types.h | 24 >> 1 file changed, 24 insertions(+) >> >> diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h >> index 0b63d48..2dfa8dd 100644 >> --- a/src/glsl/glsl_types.h >> +++ b/src/glsl/glsl_types.h >> @@ -354,6 +354,14 @@ struct glsl_type { >> } >> >> /** >> +* Query whether or not a type is a matrix or an array of matrices >> +*/ >> + bool is_matrix_or_array_of() const >> + { >> + return is_matrix() || (is_array() && fields.array->is_matrix()); >> + } >> + >> + /** >> * Query whether or not a type is a non-array numeric type >> */ >> bool is_numeric() const >> @@ -441,6 +449,14 @@ struct glsl_type { >> } >> >> /** >> +* Query whether or not a type is a record or an array of records >> +*/ >> + bool is_record_or_array_of() const >> + { >> + return is_record() || (is_array() && fields.array->is_record()); >> + } >> + >> + /** >> * Query whether or not a type is an interface >> */ >> bool is_interface() const >> @@ -449,6 +465,14 @@ struct glsl_type { >> } >> >> /** >> +* Query whether or not a type is an interface or an array of interfaces >> +*/ >> + bool is_interface_or_array_of() const >> + { >> + return is_interface() || (is_array() && fields.array->is_interface()); >> + } >> + >> + /** >> * Query whether or not a type is the void type singleton. >> */ >> bool is_void() const ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
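
The without_array() idea from this reply can be sketched in plain C. This is an illustration under assumptions, not the glsl_type implementation: struct type_node and enum type_kind are hypothetical stand-ins, and the loop peels every array level so that arrays of arrays are handled as well, which the one-level is_*_or_array_of predicates in the quoted patch do not do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced model of a GLSL type. */
enum type_kind { KIND_VECTOR, KIND_MATRIX, KIND_RECORD, KIND_ARRAY };

struct type_node {
    enum type_kind kind;
    const struct type_node *element; /* element type, only for KIND_ARRAY */
};

/* Peel off array levels only; a vector or matrix is returned unchanged. */
static const struct type_node *
without_array(const struct type_node *type)
{
    while (type->kind == KIND_ARRAY)
        type = type->element;
    return type;
}

/* Equivalent of var->type->without_array()->is_matrix() from the mail. */
static bool
is_matrix_or_array_of(const struct type_node *type)
{
    return without_array(type)->kind == KIND_MATRIX;
}
```

Because the helper stops at the first non-array type, a vec4 stays a vec4, which is the property the name is meant to communicate.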
Re: [Mesa-dev] [PATCH] added functions for binding atomic buffers to extension GL_ARB_shader_atomic_counters for radeonsi and r600 backend
Hi, I am sorry. This is my first patch. I'll correct it for the next time. Do you want me to resend it? On Wednesday, July 23, 2014, Ian Romanick wrote: > On 07/23/2014 12:39 PM, Marek Olšák wrote: > > I thought so too, but these bits are really missing there, e.g. > > glDeleteBuffers doesn't unbind atomic buffers, etc. > > D'oh. It sounds like we need some piglit tests and probably some spec > quotations. :( > > > Marek > > > > On Wed, Jul 23, 2014 at 9:25 PM, Ilia Mirkin > wrote: > >> On Wed, Jul 23, 2014 at 3:22 PM, Marek Olšák > wrote: > >>> Please see: > >>> > >>> http://lists.freedesktop.org/archives/mesa-dev/2014-July/062818.html > >>> http://lists.freedesktop.org/archives/mesa-dev/2014-July/063798.html > >>> > >>> Also, your git username and address are wrong. You can set them with > git config. > >> > >> In addition to these more basic issues... is this patch needed at all? > >> I thought the core code was all done and it was just the mesa/st + > >> gallium interfaces that needed to be fixed up. 
> >> > >>> > >>> Marek > >>> > >>> On Wed, Jul 23, 2014 at 8:27 PM, Aditya Atluri < > adityaavina...@gmail.com > wrote: > From: Frost > > --- > src/mesa/main/bufferobj.c | 30 ++ > 1 file changed, 30 insertions(+) > > diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c > index 7b1bba0..00f2604 100644 > --- a/src/mesa/main/bufferobj.c > +++ b/src/mesa/main/bufferobj.c > @@ -832,6 +832,9 @@ _mesa_init_buffer_objects( struct gl_context *ctx > ) > _mesa_reference_buffer_object(ctx, &ctx->UniformBuffer, > ctx->Shared->NullBufferObj); > > + _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer, > + ctx->Shared->NullBufferObj); > + > _mesa_reference_buffer_object(ctx, &ctx->DrawIndirectBuffer, > ctx->Shared->NullBufferObj); > > @@ -842,6 +845,14 @@ _mesa_init_buffer_objects( struct gl_context > *ctx ) > ctx->UniformBufferBindings[i].Offset = -1; > ctx->UniformBufferBindings[i].Size = -1; > } > + > + for (i = 0; i < MAX_COMBINED_ATOMIC_BUFFERS; i++) { > + _mesa_reference_buffer_object(ctx, > + > &ctx->AtomicBufferBindings[i].BufferObject, > +ctx->Shared->NullBufferObj); > + ctx->AtomicBufferBindings[i].Offset = -1; > + ctx->AtomicBufferBindings[i].Size = -1; > + } > } > > > @@ -857,6 +868,8 @@ _mesa_free_buffer_objects( struct gl_context *ctx > ) > > _mesa_reference_buffer_object(ctx, &ctx->UniformBuffer, NULL); > > + _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer, NULL); > + > _mesa_reference_buffer_object(ctx, &ctx->DrawIndirectBuffer, > NULL); > > for (i = 0; i < MAX_COMBINED_UNIFORM_BUFFERS; i++) { > @@ -864,6 +877,12 @@ _mesa_free_buffer_objects( struct gl_context > *ctx ) > > &ctx->UniformBufferBindings[i].BufferObject, > NULL); > } > + > + for (i = 0; i < MAX_COMBINED_ATOMIC_BUFFERS; i++) { > + _mesa_reference_buffer_object(ctx, > + > &ctx->AtomicBufferBindings[i].BufferObject, > +NULL); > + } > } > > bool > @@ -1200,6 +1219,17 @@ _mesa_DeleteBuffers(GLsizei n, const GLuint > *ids) > _mesa_BindBuffer( GL_UNIFORM_BUFFER, 0 ); > } > > + /* unbind 
Atomic Buffers binding points */ > + for (j = 0; j < ctx->Const.MaxAtomicBufferBindings; j++) { > + if (ctx->AtomicBufferBindings[j].BufferObject == > bufObj) { > + _mesa_BindBufferBase( GL_ATOMIC_COUNTER_BUFFER, j, > 0 ); > + } > + } > + > + if (ctx->AtomicBuffer == bufObj) { > +_mesa_BindBuffer(GL_ATOMIC_COUNTER_BUFFER, 0); > + } > + > /* unbind any pixel pack/unpack pointers bound to this > buffer */ > if (ctx->Pack.BufferObj == bufObj) { > _mesa_BindBuffer( GL_PIXEL_PACK_BUFFER_EXT, 0 ); > -- > 1.7.9.5 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > >>> ___ > >>> mesa-dev mailing list > >>> mesa-dev@lists.freedesktop.org > >>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > > ___ > mesa-dev mailing list >
Re: [Mesa-dev] [PATCH] i965: Accelerate uploads of RGBA and BGRA GL_UNSIGNED_INT_8_8_8_8_REV textures
On Friday, July 18, 2014 06:25:25 PM Jason Ekstrand wrote:
> Since intel is always going to be little-endian,
> GL_UNSIGNED_INT_8_8_8_8_REV is the same as GL_BYTE for RGBA and BGRA

I think you meant GL_UNSIGNED_BYTE.

Reviewed-by: Kenneth Graunke

> textures, so the same acceleration code will work. We might as well use
> it.
> ---
>  src/mesa/drivers/dri/i965/intel_tex_subimage.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> index c73cf10..875190f 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
> @@ -560,7 +560,7 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
>      * we need tests.
>      */
>     if (!brw->has_llc ||
> -       type != GL_UNSIGNED_BYTE ||
> +       !(type == GL_UNSIGNED_BYTE || type == GL_UNSIGNED_INT_8_8_8_8_REV) ||
>         texImage->TexObject->Target != GL_TEXTURE_2D ||
>         pixels == NULL ||
>         _mesa_is_bufferobj(packing->BufferObj) ||
> @@ -573,6 +573,10 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
>         packing->Invert)
>        return false;
>
> +   if (type == GL_UNSIGNED_INT_8_8_8_8_REV &&
> +       !(format == GL_RGBA || format == GL_BGRA))
> +      return false; /* Invalid type/format combination */
> +
>     if ((texImage->TexFormat == MESA_FORMAT_L_UNORM8 && format == GL_LUMINANCE) ||
>         (texImage->TexFormat == MESA_FORMAT_A_UNORM8 && format == GL_ALPHA)) {
>        cpp = 1;
>
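
The equivalence the commit message relies on can be checked with a few lines of C. In GL_UNSIGNED_INT_8_8_8_8_REV the first component occupies the least-significant 8 bits of the 32-bit word, so on a little-endian host the word's memory image is the four components in order, exactly as GL_UNSIGNED_BYTE would store them. The function names below are hypothetical and the sketch is independent of the i965 code.

```c
#include <stdint.h>
#include <string.h>

/* Pack four components as 8_8_8_8_REV: first component in bits 0-7,
 * last component in bits 24-31. */
static uint32_t
pack_8888_rev(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3)
{
    return (uint32_t)c0 |
           ((uint32_t)c1 << 8) |
           ((uint32_t)c2 << 16) |
           ((uint32_t)c3 << 24);
}

/* Returns 1 when the lowest-addressed byte of a word is its
 * least-significant byte. */
static int
host_is_little_endian(void)
{
    const uint32_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Returns 1 when the packed word, viewed as raw memory, equals the byte
 * sequence c0, c1, c2, c3 (the GL_UNSIGNED_BYTE layout). */
static int
packed_matches_byte_layout(uint8_t c0, uint8_t c1, uint8_t c2, uint8_t c3)
{
    const uint8_t bytes[4] = { c0, c1, c2, c3 };
    const uint32_t packed = pack_8888_rev(c0, c1, c2, c3);

    return memcmp(&packed, bytes, sizeof(packed)) == 0;
}
```

On a big-endian host the memory image would be reversed, which is why the shortcut depends on the little-endian assumption stated in the commit message.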
Re: [Mesa-dev] glsl/glcpp: A bunch of pre-processor cleanups
On 2014-07-17 16:45:34, Jordan Justen wrote: > Made it ~25% through. :) I'll be busy for a bit, but I'll continue > looking at the rest later. > > 01/23 glsl/glcpp: Emit proper error for #define with a non-identifier > Reviewed-by: Jordan Justen > > 02/23 glsl/glcpp: Add support for comments between #define and macro > identifier > Reviewed-by: Jordan Justen > > 03/23 glsl/glcpp: Remove some un-needed calls to NEWLINE_CATCHUP > * Reference 6005e9cb in comment? > Reviewed-by: Jordan Justen > > 04/23 glsl/glcpp: Add testing for EOF sans newline (and fix for > , ) > Reviewed-by: Jordan Justen > > 05/23 glsl/glcpp: Drop extra, final newline from most output > * In the "\n {" section, you set > "parser->last_token_was_newline = 1;" > Doesn't "RETURN_TOKEN (NEWLINE);" do this as well? > Reviewed-by: Jordan Justen > > 06/23 glsl/glcpp: Abstract a bit of common code for returning string tokens > Reviewed-by: Jordan Justen 07/23 glsl/glcpp: Stop using a lexer start condition () for token skipping. Replied with question 08/23 glsl/glcpp: Minor tweak to wording of error message 09/23 glsl/glcpp: Fix off-by-one error in column in first-line error messages 10/23 glsl/glcpp: Add a -d/--debug option to the standalone glcpp program 11/23 glsl/glcpp: Don't use start-condition stack when switching to/from 12/23 glsl/glcpp: Rename HASH token to HASH_TOKEN Reviewed-by: Jordan Justen 13/23 glsl/glcpp: Correctly parse directives with intervening comments Commit message: simpoly => simply Not tab indented: "parser->first_non_space_token_this_line = 1;" Reviewed-by: Jordan Justen 14/23 glsl: Add an internal-error catch-all rule In your updated commit message (on git branch), you mention that this fixes two Khronos negative tests. But, it looks like we will now print an "Internal compiler error" error message. So, we'll fail to compile the negative tests, yet report that we have a compiler error? Does the 15 patch fix the error reported? 
If so, then maybe you can update the commit messages of the two patches? Also, if patch 15 fixes the error message, then add my Reviewed-by for this patch. Do we have a make check test for the errors that this fixes? 15/23 glsl: Properly lex extra tokens when handling # directives. Reviewed-by: Jordan Justen 16/23 glsl/glcpp: Drop the HASH_ prefix from token names like HASH_IF Commit message: "Note, that HASH_TOKEN instead of HASH": should this be something like "Note that for the same reason we use HASH_TOKEN instead of HASH"? Reviewed-by: Jordan Justen 17/23 glsl/glcpp: Add an explantory comment for "loc != NULL" check 18/23 glsl/glcpp: Emit error for duplicate parameter name in function-like macro 19/23 glsl/glcpp: Add (non)-support for ++ and -- operators 20/23 glsl/glcpp: Test that macro parameters substitute immediately after periods 21/23 glsl/glcpp: Add test for a multi-line comment within an #if 0 block Reviewed-by: Jordan Justen 22/23 glsl/glcpp: Add a catch-all rule for unexpected characters. Commit message: anyt => any Reviewed-by: Jordan Justen 23/23 glsl/glcpp: Treat carriage return as equivalent to line feed. Should we add this before the patch that will cause the error to be generated? Reviewed-by: Jordan Justen > On Thu, Jun 26, 2014 at 3:19 PM, Carl Worth wrote: > > Here's my latest series of patches to improve conformance of glcpp, (the > > glsl > > preprocessor in mesa). > > > > Most of these changes are fixes that only a test-suite author could love. > > Most > > fix nit-picky tests that do things that no sance application would actually > > do. They're all reasonable things to do, but few are likely to impact many > > real applications. 
> > > > The entire series (as well as some earlier patches already reviewed) can be > > found on the glcpp-fixup branch of my mesa tree: > > > > git://people.freedesktop.org/~cworth/mesa > > > > Here's a run-down of what the changes are in this series: > > > > Patch 01: Give an error for "#define 123" or similar non-identifier > > > > Not a useful thing to do, of course, but an error we need. > > > > Patch 02: Support comment here: "#define /* Ha! */ FOO" > > > > Patches 03-12: Many cleanups/rewriting while working on the next patch > > > > Patch 13: Support comment here: "# /* Tricky! */ define FOO" > > > > Comments appearing in these places are not likely, but are clearly > > valid according to the language specification. There was a bunch of > > work necessary to make this fix easy, (and even with all the > > preliminary work, the final patch was longer than I wanted). > > > > I am happy that the lexer state at the end of this cleanup is much > > simpler and easier to read than it was before. > > > > Patch 14: Emit internal error for unrecognized character > > > > This is to make un-subtle all classes of subtle bugs where the > > default > > flex rul
[Mesa-dev] [PATCHv5 0/6] Software rendering in EGL-DRM
This time for real - the final iteration of Giovanni's work on the topic of using dumb buffers for GEM-less kernel devices (qxl).

Changes since last time:
- Fix a couple of regressions that snuck in with megadrivers-gbm.
- Make it work when GBM_ALWAYS_SOFTWARE is not set.
- Create a separate driverAPI for kms_swrast, with a different InitScreen hook. In the latter we explicitly call the kms_swrast winsys and swrast.
- Removed PIPE_CAP_BUFFER_SHARE and sw_winsys::get_param. With the different InitScreen hooks, can_share_buffers is explicitly set.
- Small typos and commit msg fixes.

The series has been confirmed working with gnome-continuous. Huge thanks to everyone who helped - Jasper and Colin :)

If there are no objections I will be pushing these in around Sun-Mon.

Last but not least, thanks to Giovanni for his work, and for keeping up with my naive nit-picking :)

Emil
[Mesa-dev] [PATCH 2/6] st/gbm: don't segfault if the fail to create the screen
Whenever dd_create_screen/pipe_loader_* fails, gdrm->dev may be NULL. Thus peeking inside the struct will lead to a crash.

Signed-off-by: Emil Velikov
---
 src/gallium/state_trackers/gbm/gbm_drm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/gbm/gbm_drm.c b/src/gallium/state_trackers/gbm/gbm_drm.c
index 077d518..95b07ef 100644
--- a/src/gallium/state_trackers/gbm/gbm_drm.c
+++ b/src/gallium/state_trackers/gbm/gbm_drm.c
@@ -292,7 +292,7 @@ gbm_gallium_drm_device_create(int fd)
    return &gdrm->base.base;
 
 out_no_screen:
-   debug_printf("failed to load driver: %s\n", gdrm->dev->driver_name);
+   debug_printf("failed to load gallium_gbm\n");
 #if !GALLIUM_STATIC_TARGETS
    if (gdrm->dev)
       pipe_loader_release(&gdrm->dev, 1);
-- 
2.0.2
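
The patch avoids the crash by dropping the driver name from the message. For comparison only, a NULL-safe lookup that keeps the name when it is available could look like the sketch below. This is not what the patch does; struct sketch_loader_device and name_for_error_message() are hypothetical and merely model a device pointer that may be NULL on the error path.

```c
#include <stddef.h>

/* Hypothetical stand-in for the loader device referenced by gdrm->dev. */
struct sketch_loader_device {
    const char *driver_name;
};

/* Return a printable name without dereferencing a NULL device. */
static const char *
name_for_error_message(const struct sketch_loader_device *dev)
{
    if (dev == NULL || dev->driver_name == NULL)
        return "(unknown)";
    return dev->driver_name;
}
```

The fixed-string message in the patch is the simpler choice, since on static targets there is no loader device to ask in the first place.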
[Mesa-dev] [PATCH 1/6] st/gbm: retrieve the driver-name via dd_driver_name()
... on static targets. Otherwise we'll crash badly as gdrm->dev is NULL when we try to copy the string driver_name.

Signed-off-by: Emil Velikov
---
 src/gallium/state_trackers/gbm/gbm_drm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/state_trackers/gbm/gbm_drm.c b/src/gallium/state_trackers/gbm/gbm_drm.c
index bfd48a0..077d518 100644
--- a/src/gallium/state_trackers/gbm/gbm_drm.c
+++ b/src/gallium/state_trackers/gbm/gbm_drm.c
@@ -282,7 +282,13 @@ gbm_gallium_drm_device_create(int fd)
    if (gdrm->screen == NULL)
       goto out_no_screen;
 
+#if GALLIUM_STATIC_TARGETS
+   gdrm->base.driver_name = strdup(dd_driver_name());
+#else
+#ifdef HAVE_PIPE_LOADER_DRM
    gdrm->base.driver_name = strdup(gdrm->dev->driver_name);
+#endif /* HAVE_PIPE_LOADER_DRM */
+#endif
    return &gdrm->base.base;
 
 out_no_screen:
-- 
2.0.2
[Mesa-dev] [PATCH 6/6] gallium: remove PIPE_CAP_BUFFER_SHARE cap and get_param sw_winsys hook
The kms_swrast driver has a separate InitScreen hook for its DriverAPI from the rest of the DRI2 drivers, all of which capable of buffer sharing. As such we no longer need to dive through the pipe-driver and winsys layers in order to determine if the driver can share buffers or not and we can explicitly set screen->can_share_buffer in InitScreen. XXX: Squash with the original commit ? Cc: Giovanni Campagna Signed-off-by: Emil Velikov --- src/gallium/docs/source/screen.rst| 4 src/gallium/drivers/freedreno/freedreno_screen.c | 1 - src/gallium/drivers/i915/i915_screen.c| 1 - src/gallium/drivers/ilo/ilo_screen.c | 2 -- src/gallium/drivers/llvmpipe/lp_screen.c | 7 --- src/gallium/drivers/nouveau/nv30/nv30_screen.c| 1 - src/gallium/drivers/nouveau/nv50/nv50_screen.c| 1 - src/gallium/drivers/nouveau/nvc0/nvc0_screen.c| 1 - src/gallium/drivers/r300/r300_screen.c| 1 - src/gallium/drivers/r600/r600_pipe.c | 1 - src/gallium/drivers/radeonsi/si_pipe.c| 1 - src/gallium/drivers/softpipe/sp_screen.c | 7 --- src/gallium/drivers/svga/svga_screen.c| 3 --- src/gallium/include/pipe/p_defines.h | 1 - src/gallium/include/state_tracker/sw_winsys.h | 5 - src/gallium/state_trackers/dri/dri2.c | 3 ++- src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 14 -- 17 files changed, 2 insertions(+), 52 deletions(-) diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index b09f18bd..ba583fe 100644 --- a/src/gallium/docs/source/screen.rst +++ b/src/gallium/docs/source/screen.rst @@ -213,10 +213,6 @@ The integer capabilities: * ``PIPE_CAP_DRAW_INDIRECT``: Whether the driver supports taking draw arguments { count, instance_count, start, index_bias } from a PIPE_BUFFER resource. See pipe_draw_info. -* ``PIPE_CAP_BUFFER_SHARE``: Whether it is possible to share buffers between - processes using the native window system. If this is 0, the buffers and - display targets available are only valid for in-process rendering and - scanout. This will be 1 for most HW drivers. .. 
_pipe_capf: diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c index 05426dc..c574cb8 100644 --- a/src/gallium/drivers/freedreno/freedreno_screen.c +++ b/src/gallium/drivers/freedreno/freedreno_screen.c @@ -175,7 +175,6 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS: case PIPE_CAP_USER_CONSTANT_BUFFERS: case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT: - case PIPE_CAP_BUFFER_SHARE: return 1; case PIPE_CAP_SHADER_STENCIL_EXPORT: diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c index 437f4bd..86a7a67 100644 --- a/src/gallium/drivers/i915/i915_screen.c +++ b/src/gallium/drivers/i915/i915_screen.c @@ -186,7 +186,6 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap) case PIPE_CAP_USER_VERTEX_BUFFERS: case PIPE_CAP_USER_INDEX_BUFFERS: case PIPE_CAP_USER_CONSTANT_BUFFERS: - case PIPE_CAP_BUFFER_SHARE: return 1; /* Unsupported features (boolean caps). 
*/ diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c index 6b96e5b..e2a0e23 100644 --- a/src/gallium/drivers/ilo/ilo_screen.c +++ b/src/gallium/drivers/ilo/ilo_screen.c @@ -433,8 +433,6 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param) case PIPE_CAP_TEXTURE_GATHER_OFFSETS: case PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION: return 0; - case PIPE_CAP_BUFFER_SHARE: - return 1; default: return 0; diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c index a7659c7..e25d14e 100644 --- a/src/gallium/drivers/llvmpipe/lp_screen.c +++ b/src/gallium/drivers/llvmpipe/lp_screen.c @@ -105,8 +105,6 @@ llvmpipe_get_name(struct pipe_screen *screen) static int llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) { - struct llvmpipe_screen *lp_screen = llvmpipe_screen(screen); - switch (param) { case PIPE_CAP_NPOT_TEXTURES: case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES: @@ -253,11 +251,6 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) return 0; case PIPE_CAP_FAKE_SW_MSAA: return 1; - case PIPE_CAP_BUFFER_SHARE: - if (lp_screen->winsys->get_param != NULL) - return lp_screen->winsys->get_param(lp_screen->winsys, param); - else - return 1; } /* should only get here on unhandled cases */ debug_printf("Unexpected PIPE_CAP %d query\n", param); diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c index 41ccc10..32f5523 100644 --- a/src/gallium/driver
[Mesa-dev] [PATCH 4/6] gallium: Add a dumb drm/kms winsys backed swrast provider
From: Giovanni Campagna Add a new winsys and target that can be used with a dri2 state tracker and loader instead of drisw. This allows to use gbm as a dri2/image loader and avoid the extra copy from the backbuffer to the shadow frontbuffer. The new driver is called "kms_swrast", and is loaded by gbm as a fallback, because it is only useful with the gbm platform (as no buffer sharing is possible) To force select the driver set the environment variable GBM_ALWAYS_SOFTWARE [Emil Velikov] - Rebase on top of gallium megadriver. - s/text/test/ in configure.ac (Spotted by Andreas Pokorny). - Add scons support for winsys/sw/kms-dri and fix the build. - Provide separate DriverAPI, due to different InitScreen hook. Signed-off-by: Emil Velikov --- configure.ac | 5 + docs/relnotes/10.3.html| 2 + src/gallium/SConscript | 4 + .../auxiliary/target-helpers/inline_drm_helper.h | 32 +++ src/gallium/state_trackers/dri/dri2.c | 63 + src/gallium/state_trackers/dri/dri_screen.h| 3 + src/gallium/targets/dri/Makefile.am| 6 + src/gallium/targets/dri/SConscript | 4 + src/gallium/winsys/Makefile.am | 4 + src/gallium/winsys/sw/kms-dri/Makefile.am | 32 +++ src/gallium/winsys/sw/kms-dri/SConscript | 26 ++ src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 312 + src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.h | 37 +++ src/gbm/backends/dri/gbm_dri.c | 38 ++- 14 files changed, 564 insertions(+), 4 deletions(-) create mode 100644 src/gallium/winsys/sw/kms-dri/Makefile.am create mode 100644 src/gallium/winsys/sw/kms-dri/SConscript create mode 100644 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c create mode 100644 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.h diff --git a/configure.ac b/configure.ac index 744e55c..0fc2da5 100644 --- a/configure.ac +++ b/configure.ac @@ -1988,6 +1988,10 @@ if test -n "$with_gallium_drivers"; then if test "x$enable_dri" = xyes; then GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS dri/swrast" fi + +if test "x$have_libdrm" = xyes; then 
+GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS dri/kms-swrast" +fi ;; *) AC_MSG_ERROR([Unknown Gallium driver: $driver]) @@ -2224,6 +2228,7 @@ AC_CONFIG_FILES([Makefile src/gallium/winsys/svga/drm/Makefile src/gallium/winsys/sw/dri/Makefile src/gallium/winsys/sw/fbdev/Makefile + src/gallium/winsys/sw/kms-dri/Makefile src/gallium/winsys/sw/null/Makefile src/gallium/winsys/sw/wayland/Makefile src/gallium/winsys/sw/wrapper/Makefile diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html index 1f4e1da..9a74230 100644 --- a/docs/relnotes/10.3.html +++ b/docs/relnotes/10.3.html @@ -59,6 +59,8 @@ Note: some of the new features are only available with certain drivers. GL_ARB_fragment_layer_viewport on nv50, nvc0, llvmpipe, r600 GL_AMD_vertex_shader_viewport_index on i965/gen7+, r600 GL_ARB_clear_texture on i965 +A new software rasterizer driver (kms_swrast_dri.so) that works with +DRM drivers that don't have a full-fledged GEM (such as qxl or simpledrm) diff --git a/src/gallium/SConscript b/src/gallium/SConscript index 8d9849e..cb61720 100644 --- a/src/gallium/SConscript +++ b/src/gallium/SConscript @@ -71,6 +71,10 @@ if env['dri']: ]) SConscript([ +'winsys/sw/kms-dri/SConscript', +]) + +SConscript([ 'winsys/svga/drm/SConscript', ]) diff --git a/src/gallium/auxiliary/target-helpers/inline_drm_helper.h b/src/gallium/auxiliary/target-helpers/inline_drm_helper.h index 5656ef0..751ceb1 100644 --- a/src/gallium/auxiliary/target-helpers/inline_drm_helper.h +++ b/src/gallium/auxiliary/target-helpers/inline_drm_helper.h @@ -8,6 +8,11 @@ #include "dri_screen.h" #endif +#if GALLIUM_SOFTPIPE +#include "target-helpers/inline_sw_helper.h" +#include "sw/kms-dri/kms_dri_sw_winsys.h" +#endif + #if GALLIUM_I915 #include "i915/drm/i915_drm_public.h" #include "i915/i915_public.h" @@ -53,6 +58,33 @@ static char* driver_name = NULL; /* XXX: We need to teardown the winsys if *screen_create() fails. 
*/ +#if defined(GALLIUM_SOFTPIPE) +#if defined(DRI_TARGET) + +const __DRIextension **__driDriverGetExtensions_kms_swrast(void); + +PUBLIC const __DRIextension **__driDriverGetExtensions_kms_swrast(void) +{ + globalDriverAPI = &dri_kms_driver_api; + return galliumdrm_driver_extensions; +} + +struct pipe_screen * +kms_swrast_create_screen(int fd) +{ + struct sw_winsys *sws; + struct pipe_screen *screen; + + sws = kms_dri_create_winsys(fd); + if (!sws) + return NULL; + + screen = sw_screen_create(sws); + return screen ?
[Mesa-dev] [PATCH 5/6] dri: Add a new capabilities for drivers that can't share buffers
From: Giovanni Campagna The kms-dri swrast driver cannot share buffers using the GEM, so it must tell the loader to disable extensions relying on that, without disabling the image DRI extension altogether (which would prevent the loader from working at all). This requires a new gallium capability (which is queried on the pipe_screen and for swrast drivers it's forwarded to the winsys), and requires a new version of the DRI image extension. [Emil Velikov] - Rebased on top of gallium-dri megadrivers. Signed-off-by: Emil Velikov --- include/GL/internal/dri_interface.h | 17 ++- src/egl/drivers/dri2/egl_dri2.c | 10 - src/egl/drivers/dri2/platform_drm.c | 14 ++-- src/gallium/docs/source/screen.rst| 4 src/gallium/drivers/freedreno/freedreno_screen.c | 1 + src/gallium/drivers/i915/i915_screen.c| 1 + src/gallium/drivers/ilo/ilo_screen.c | 2 ++ src/gallium/drivers/llvmpipe/lp_screen.c | 9 +++- src/gallium/drivers/nouveau/nv30/nv30_screen.c| 1 + src/gallium/drivers/nouveau/nv50/nv50_screen.c| 1 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c| 1 + src/gallium/drivers/r300/r300_screen.c| 1 + src/gallium/drivers/r600/r600_pipe.c | 1 + src/gallium/drivers/radeonsi/si_pipe.c| 1 + src/gallium/drivers/softpipe/sp_screen.c | 7 ++ src/gallium/drivers/svga/svga_screen.c| 3 +++ src/gallium/include/pipe/p_defines.h | 1 + src/gallium/include/state_tracker/sw_winsys.h | 5 + src/gallium/state_trackers/dri/dri2.c | 24 + src/gallium/state_trackers/dri/dri_screen.h | 1 + src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 26 --- 21 files changed, 119 insertions(+), 12 deletions(-) diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h index 7c28c13..8c5ceb9 100644 --- a/include/GL/internal/dri_interface.h +++ b/include/GL/internal/dri_interface.h @@ -1005,7 +1005,7 @@ struct __DRIdri2ExtensionRec { * extensions. 
*/ #define __DRI_IMAGE "DRI_IMAGE" -#define __DRI_IMAGE_VERSION 9 +#define __DRI_IMAGE_VERSION 10 /** * These formats correspond to the similarly named MESA_FORMAT_* @@ -1134,6 +1134,13 @@ enum __DRIChromaSiting { /*@}*/ /** + * \name Capabilities that might be returned by __DRIimageExtensionRec::getCapabilities + */ +/*@{*/ +#define __DRI_IMAGE_CAP_GLOBAL_NAMES 1 +/*@}*/ + +/** * blitImage flags */ @@ -1261,6 +1268,14 @@ struct __DRIimageExtensionRec { int dstx0, int dsty0, int dstwidth, int dstheight, int srcx0, int srcy0, int srcwidth, int srcheight, int flush_flag); + + /** +* Query for general capabilities of the driver that concern +* buffer sharing and image importing. +* +* \since 10 +*/ + int (*getCapabilities)(__DRIscreen *screen); }; diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index cc7531c..5602ec3 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -518,7 +518,15 @@ dri2_setup_screen(_EGLDisplay *disp) } if (dri2_dpy->image) { - disp->Extensions.MESA_drm_image = EGL_TRUE; + if (dri2_dpy->image->base.version >= 10 && + dri2_dpy->image->getCapabilities != NULL) { + int capabilities; + + capabilities = dri2_dpy->image->getCapabilities(dri2_dpy->dri_screen); + disp->Extensions.MESA_drm_image = (capabilities & __DRI_IMAGE_CAP_GLOBAL_NAMES) != 0; + } else + disp->Extensions.MESA_drm_image = EGL_TRUE; + disp->Extensions.KHR_image_base = EGL_TRUE; disp->Extensions.KHR_gl_renderbuffer_image = EGL_TRUE; if (dri2_dpy->image->base.version >= 5 && diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/platform_drm.c index 23a8d27..e272beb 100644 --- a/src/egl/drivers/dri2/platform_drm.c +++ b/src/egl/drivers/dri2/platform_drm.c @@ -685,8 +685,18 @@ dri2_initialize_drm(_EGLDriver *drv, _EGLDisplay *disp) disp->Extensions.EXT_buffer_age = EGL_TRUE; #ifdef HAVE_WAYLAND_PLATFORM - if (dri2_dpy->image) - disp->Extensions.WL_bind_wayland_display = EGL_TRUE; + if (dri2_dpy->image) { + if 
(dri2_dpy->image->base.version >= 10 && + dri2_dpy->image->getCapabilities != NULL) { + int capabilities; + + capabilities = + dri2_dpy->image->getCapabilities(dri2_dpy->dri_screen); + disp->Extensions.WL_bind_wayland_display = + (capabilities & __DRI_IMAGE_CAP_GLOBAL_NAMES) != 0; + } else + disp->Extensions.WL_bind_wayland_display = EGL_TRUE; + } #endif /* we're supporting EGL 1.4 */ diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index ba583fe..b09f18
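The loader-side check added by this patch can be sketched on its own. This is a minimal illustration, not Mesa code: `fake_screen`, `fake_image_ext`, `fake_get_capabilities` and `loader_allows_global_names` are hypothetical stand-ins for `__DRIscreen`, `__DRIimageExtensionRec` and the logic in `dri2_setup_screen()`. Only the capability bit (value 1) and the version-10 threshold are taken from the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same value as __DRI_IMAGE_CAP_GLOBAL_NAMES in the patch; the name is a mock. */
#define FAKE_IMAGE_CAP_GLOBAL_NAMES 1

/* Hypothetical stand-in for __DRIscreen. */
struct fake_screen {
    int caps;   /* capability bits the driver reports */
};

/* Hypothetical stand-in for the versioned image extension struct. */
struct fake_image_ext {
    int version;
    int (*getCapabilities)(struct fake_screen *screen);  /* present since v10 */
};

/* Example driver hook: simply reports the bits stored in the screen. */
int fake_get_capabilities(struct fake_screen *screen)
{
    return screen->caps;
}

/*
 * Same shape as the check in the patch: a driver that predates version 10
 * (or leaves the hook NULL) is assumed to support global names, a newer
 * driver is asked explicitly.
 */
bool loader_allows_global_names(const struct fake_image_ext *image,
                                struct fake_screen *screen)
{
    if (image == NULL)
        return false;   /* no image extension, nothing to enable */

    if (image->version >= 10 && image->getCapabilities != NULL) {
        int capabilities = image->getCapabilities(screen);
        return (capabilities & FAKE_IMAGE_CAP_GLOBAL_NAMES) != 0;
    }

    return true;        /* legacy behaviour: always enabled */
}
```

With this shape, a kms_swrast-like driver that reports no capability bits gets the extension disabled, while every driver that has not been updated keeps the old behaviour.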
[Mesa-dev] [PATCH 3/6] Add support for swrast to the DRM EGL platform
From: Giovanni Campagna Turn GBM into a swrast loader (providing putimage/getimage backed by a dumb KMS buffer). This allows to run KMS+DRM GL applications (such as weston or mutter-wayland) unmodified on cards that don't have any client side HW acceleration component but that can do modeset (examples include simpledrm and qxl) [Emil Velikov] - Fix make check. - Split dri_open_driver() from dri_load_driver(). - Don't try to bind the swrast extensions when using dri. - Handle swrast->CreateNewScreen() failure. - strdup the driver_name, as it's free'd at destruction. - s/LIBGL_ALWAYS_SOFTWARE/GBM_ALWAYS_SOFTWARE/ - Move gbm_dri_bo_map/unmap to gbm_driiint.h. - Correct swrast fallback logic. Signed-off-by: Emil Velikov --- src/egl/drivers/dri2/platform_drm.c | 153 +++ src/gbm/backends/dri/gbm_dri.c | 203 +++- src/gbm/backends/dri/gbm_driint.h | 57 +- src/gbm/gbm-symbols-check | 1 + src/gbm/main/gbm.h | 3 + 5 files changed, 369 insertions(+), 48 deletions(-) diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/platform_drm.c index 6227bc9..23a8d27 100644 --- a/src/egl/drivers/dri2/platform_drm.c +++ b/src/egl/drivers/dri2/platform_drm.c @@ -44,6 +44,7 @@ lock_front_buffer(struct gbm_surface *_surf) { struct gbm_dri_surface *surf = (struct gbm_dri_surface *) _surf; struct dri2_egl_surface *dri2_surf = surf->dri_private; + struct gbm_dri_device *device = (struct gbm_dri_device *) _surf->gbm; struct gbm_bo *bo; if (dri2_surf->current == NULL) { @@ -52,8 +53,11 @@ lock_front_buffer(struct gbm_surface *_surf) } bo = dri2_surf->current->bo; - dri2_surf->current->locked = 1; - dri2_surf->current = NULL; + + if (device->dri2) { + dri2_surf->current->locked = 1; + dri2_surf->current = NULL; + } return bo; } @@ -122,13 +126,22 @@ dri2_drm_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type, goto cleanup_surf; } - dri2_surf->dri_drawable = - (*dri2_dpy->dri2->createNewDrawable) (dri2_dpy->dri_screen, - dri2_conf->dri_double_config, - 
dri2_surf->gbm_surf); + if (dri2_dpy->dri2) { + dri2_surf->dri_drawable = + (*dri2_dpy->dri2->createNewDrawable) (dri2_dpy->dri_screen, + dri2_conf->dri_double_config, + dri2_surf->gbm_surf); + + } else { + assert(dri2_dpy->swrast != NULL); + dri2_surf->dri_drawable = + (*dri2_dpy->swrast->createNewDrawable) (dri2_dpy->dri_screen, + dri2_conf->dri_double_config, + dri2_surf->gbm_surf); + } if (dri2_surf->dri_drawable == NULL) { - _eglError(EGL_BAD_ALLOC, "dri2->createNewDrawable"); + _eglError(EGL_BAD_ALLOC, "createNewDrawable()"); goto cleanup_surf; } @@ -221,6 +234,28 @@ get_back_bo(struct dri2_egl_surface *dri2_surf) return 0; } +static int +get_swrast_front_bo(struct dri2_egl_surface *dri2_surf) +{ + struct dri2_egl_display *dri2_dpy = + dri2_egl_display(dri2_surf->base.Resource.Display); + struct gbm_dri_surface *surf = dri2_surf->gbm_surf; + + if (dri2_surf->current == NULL) { + assert(!dri2_surf->color_buffers[0].locked); + dri2_surf->current = &dri2_surf->color_buffers[0]; + } + + if (dri2_surf->current->bo == NULL) + dri2_surf->current->bo = gbm_bo_create(&dri2_dpy->gbm_dri->base.base, + surf->base.width, surf->base.height, + surf->base.format, surf->base.flags); + if (dri2_surf->current->bo == NULL) + return -1; + + return 0; +} + static void back_bo_to_dri_buffer(struct dri2_egl_surface *dri2_surf, __DRIbuffer *buffer) { @@ -374,19 +409,23 @@ dri2_drm_swap_buffers(_EGLDriver *drv, _EGLDisplay *disp, _EGLSurface *draw) struct dri2_egl_surface *dri2_surf = dri2_egl_surface(draw); int i; - if (dri2_surf->base.Type == EGL_WINDOW_BIT) { - if (dri2_surf->current) -_eglError(EGL_BAD_SURFACE, "dri2_swap_buffers"); - for (i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) - if (dri2_surf->color_buffers[i].age > 0) -dri2_surf->color_buffers[i].age++; - dri2_surf->current = dri2_surf->back; - dri2_surf->current->age = 1; - dri2_surf->back = NULL; - } + if (dri2_dpy->swrast) { + (*dri2_dpy->core->swapBuffers)(dri2_surf->dri_drawable); + } else { + if 
(dri2_surf->base.Type == EGL_WINDOW_BIT) { + if (dri2_surf->current) +_eglError(EGL_BAD_SURFACE, "dri2_swap_buffers"); + for (i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) +if (dri2_surf->color_buffers[i].age > 0) +
[Mesa-dev] [Bug 81693] New: scons build fails
https://bugs.freedesktop.org/show_bug.cgi?id=81693

Priority: medium
Bug ID: 81693
Assignee: mesa-dev@lists.freedesktop.org
Summary: scons build fails
Severity: normal
Classification: Unclassified
OS: All
Reporter: ja...@jlekstrand.net
Hardware: Other
Status: NEW
Version: git
Component: Mesa core
Product: Mesa

I believe this is as of c4067acd908322d79a4e08b9f4fffdd453c518ee but I have
not done a full bisect.

--
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: add ARB_clear_texture.xml to file list, remove duplicate decls
Signed-off-by: Ilia Mirkin --- Noticed while helping jekstrand debug his issues. This didn't help, but seems correct. src/mapi/glapi/gen/Makefile.am | 1 + src/mesa/main/teximage.h | 12 2 files changed, 1 insertion(+), 12 deletions(-) diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am index be7d9e0..212731f 100644 --- a/src/mapi/glapi/gen/Makefile.am +++ b/src/mapi/glapi/gen/Makefile.am @@ -112,6 +112,7 @@ API_XML = \ ARB_base_instance.xml \ ARB_blend_func_extended.xml \ ARB_clear_buffer_object.xml \ + ARB_clear_texture.xml \ ARB_color_buffer_float.xml \ ARB_compressed_texture_pixel_storage.xml \ ARB_compute_shader.xml \ diff --git a/src/mesa/main/teximage.h b/src/mesa/main/teximage.h index 984321c..42305f4 100644 --- a/src/mesa/main/teximage.h +++ b/src/mesa/main/teximage.h @@ -336,18 +336,6 @@ _mesa_TexStorage3DMultisample(GLenum target, GLsizei samples, GLsizei height, GLsizei depth, GLboolean fixedsamplelocations); -extern void GLAPIENTRY -_mesa_ClearTexImage(GLuint texture, GLint level, -GLenum format, GLenum type, -const void *data); - -extern void GLAPIENTRY -_mesa_ClearTexSubImage(GLuint texture, GLint level, - GLint xoffset, GLint yoffset, GLint zoffset, - GLsizei width, GLsizei height, GLsizei depth, - GLenum format, GLenum type, - const void *data); - bool _mesa_compressed_texture_pixel_storage_error_check(struct gl_context *ctx, GLint dimensions, -- 1.8.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 81693] scons build fails
https://bugs.freedesktop.org/show_bug.cgi?id=81693

Jason Ekstrand changed:

   What       |Removed |Added
   Status     |NEW     |RESOLVED
   Resolution |---     |INVALID

--- Comment #1 from Jason Ekstrand ---
Had cruft lying around in my tree from an automake build
[Mesa-dev] [PATCH 3/3] clover: Add checks for image support to the image functions
Most image functions are required to return a CL_INVALID_OPERATION error when used on devices without image support. --- src/gallium/state_trackers/clover/api/memory.cpp | 6 ++ src/gallium/state_trackers/clover/api/sampler.cpp | 3 +++ src/gallium/state_trackers/clover/api/transfer.cpp | 17 + src/gallium/state_trackers/clover/core/context.cpp | 9 + src/gallium/state_trackers/clover/core/context.hpp | 2 ++ 5 files changed, 37 insertions(+) diff --git a/src/gallium/state_trackers/clover/api/memory.cpp b/src/gallium/state_trackers/clover/api/memory.cpp index d26b1c6..77f8b96 100644 --- a/src/gallium/state_trackers/clover/api/memory.cpp +++ b/src/gallium/state_trackers/clover/api/memory.cpp @@ -106,6 +106,9 @@ clCreateImage2D(cl_context d_ctx, cl_mem_flags flags, void *host_ptr, cl_int *r_errcode) try { auto &ctx = obj(d_ctx); + if (!ctx.image_support()) + throw error(CL_INVALID_OPERATION); + if (flags & ~(CL_MEM_READ_WRITE | CL_MEM_WRITE_ONLY | CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR | CL_MEM_ALLOC_HOST_PTR | CL_MEM_COPY_HOST_PTR)) @@ -141,6 +144,9 @@ clCreateImage3D(cl_context d_ctx, cl_mem_flags flags, void *host_ptr, cl_int *r_errcode) try { auto &ctx = obj(d_ctx); + if (!ctx.image_support()) + throw error(CL_INVALID_OPERATION); + if (flags & ~(CL_MEM_READ_WRITE | CL_MEM_WRITE_ONLY | CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR | CL_MEM_ALLOC_HOST_PTR | CL_MEM_COPY_HOST_PTR)) diff --git a/src/gallium/state_trackers/clover/api/sampler.cpp b/src/gallium/state_trackers/clover/api/sampler.cpp index 403892b..7f2e04d 100644 --- a/src/gallium/state_trackers/clover/api/sampler.cpp +++ b/src/gallium/state_trackers/clover/api/sampler.cpp @@ -31,6 +31,9 @@ clCreateSampler(cl_context d_ctx, cl_bool norm_mode, cl_int *r_errcode) try { auto &ctx = obj(d_ctx); + if (!ctx.image_support()) + throw error(CL_INVALID_OPERATION); + ret_error(r_errcode, CL_SUCCESS); return new sampler(ctx, norm_mode, addr_mode, filter_mode); diff --git a/src/gallium/state_trackers/clover/api/transfer.cpp 
b/src/gallium/state_trackers/clover/api/transfer.cpp index 404ceb0..da12d2b 100644 --- a/src/gallium/state_trackers/clover/api/transfer.cpp +++ b/src/gallium/state_trackers/clover/api/transfer.cpp @@ -457,6 +457,8 @@ clEnqueueReadImage(cl_command_queue d_q, cl_mem d_mem, cl_bool blocking, auto src_origin = vector(p_origin); auto src_pitch = pitch(region, {{ img.pixel_size(), img.row_pitch(), img.slice_pitch() }}); + if (!q.device().image_support()) + throw error(CL_INVALID_OPERATION); validate_common(q, deps); validate_object(q, ptr, {}, dst_pitch, region); @@ -491,6 +493,9 @@ clEnqueueWriteImage(cl_command_queue d_q, cl_mem d_mem, cl_bool blocking, auto src_pitch = pitch(region, {{ img.pixel_size(), row_pitch, slice_pitch }}); + if (!q.device().image_support()) + throw error(CL_INVALID_OPERATION); + validate_common(q, deps); validate_object(q, img, dst_origin, region); validate_object(q, ptr, {}, src_pitch, region); @@ -522,6 +527,9 @@ clEnqueueCopyImage(cl_command_queue d_q, cl_mem d_src_mem, cl_mem d_dst_mem, auto dst_origin = vector(p_dst_origin); auto src_origin = vector(p_src_origin); + if (!q.device().image_support()) + throw error(CL_INVALID_OPERATION); + validate_common(q, deps); validate_object(q, dst_img, dst_origin, region); validate_object(q, src_img, src_origin, region); @@ -559,6 +567,9 @@ clEnqueueCopyImageToBuffer(cl_command_queue d_q, src_img.row_pitch(), src_img.slice_pitch() }}); + if (!q.device().image_support()) + throw error(CL_INVALID_OPERATION); + validate_common(q, deps); validate_object(q, dst_mem, dst_origin, dst_pitch, region); validate_object(q, src_img, src_origin, region); @@ -595,6 +606,9 @@ clEnqueueCopyBufferToImage(cl_command_queue d_q, vector_t src_origin = { src_offset }; auto src_pitch = pitch(region, {{ dst_img.pixel_size() }}); + if (!q.device().image_support()) + throw error(CL_INVALID_OPERATION); + validate_common(q, deps); validate_object(q, dst_img, dst_origin, region); validate_object(q, src_mem, src_origin, src_pitch, 
region); @@ -651,6 +665,9 @@ clEnqueueMapImage(cl_command_queue d_q, cl_mem d_mem, cl_bool blocking, auto region = vector(p_region); auto origin = vector(p_origin); + if (!q.device().image_support()) + throw error(CL_INVALID_OPERATION); + validate_common(q, deps); validate_object(q, img, origin, region); diff --git a/src/gallium/state_trackers/clover/core/context.cpp b/src/gallium/state_trackers/clover/core/context.cpp index bf4df39..722c97d 100644 --- a/src/galliu
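The guard that this patch repeats in every image entry point can be shown in isolation. Clover is C++ and signals the failure with `throw error(CL_INVALID_OPERATION)`; the sketch below shows the same idea in C with an error out-parameter. `mock_device`, `mock_image` and `mock_create_image` are hypothetical names, and the two status values are redefined under mock names (same numbers as in `CL/cl.h`) so that no OpenCL headers are needed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same numeric values as CL_SUCCESS and CL_INVALID_OPERATION in CL/cl.h. */
#define MOCK_CL_SUCCESS            0
#define MOCK_CL_INVALID_OPERATION  (-59)

/* Hypothetical device description: only the bit this patch cares about. */
struct mock_device {
    bool image_support;
};

/* Hypothetical image object; storage is supplied by the caller. */
struct mock_image {
    size_t width;
    size_t height;
};

/*
 * Hypothetical entry point in the style of clCreateImage2D(): returns the
 * image or NULL, and reports the status through r_errcode when it is set.
 */
struct mock_image *mock_create_image(const struct mock_device *dev,
                                     struct mock_image *storage,
                                     size_t width, size_t height,
                                     int *r_errcode)
{
    /* Reject the call up front on devices without image support. */
    if (!dev->image_support) {
        if (r_errcode)
            *r_errcode = MOCK_CL_INVALID_OPERATION;
        return NULL;
    }

    storage->width = width;
    storage->height = height;
    if (r_errcode)
        *r_errcode = MOCK_CL_SUCCESS;
    return storage;
}
```

In the patch the check comes before the flag validation, so on a device without images the caller would see CL_INVALID_OPERATION rather than an error about the flags.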
[Mesa-dev] [PATCH 1/3] gallium: Add PIPE_CAP_COMPUTE_IMAGES_SUPPORTED
--- src/gallium/docs/source/screen.rst| 2 ++ src/gallium/drivers/radeon/r600_pipe_common.c | 7 +++ src/gallium/include/pipe/p_defines.h | 3 ++- 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index ba583fe..830a1a5 100644 --- a/src/gallium/docs/source/screen.rst +++ b/src/gallium/docs/source/screen.rst @@ -332,6 +332,8 @@ pipe_screen::get_compute_param. clock in MHz. Value type: ``uint32_t`` * ``PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS``: Maximum number of compute units Value type: ``uint32_t`` +* ``PIPE_COMPUTE_CAP_IMAGES_SUPPORTED``: Whether images are supported + non-zero means yes, zero means no. Value type: ``uint32_t`` .. _pipe_bind: diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c index 6535992..bf0585d 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.c +++ b/src/gallium/drivers/radeon/r600_pipe_common.c @@ -519,6 +519,13 @@ static int r600_get_compute_param(struct pipe_screen *screen, *max_compute_units = MAX2(rscreen->info.max_compute_units, 1); } return sizeof(uint32_t); + + case PIPE_COMPUTE_CAP_IMAGES_SUPPORTED: + if (ret) { + uint32_t *images_supported = ret; + *images_supported = 0; + } + return sizeof(uint32_t); } fprintf(stderr, "unknown PIPE_COMPUTE_CAP %d\n", param); diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index d9b6e5a..43bb1f5 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -650,7 +650,8 @@ enum pipe_compute_cap PIPE_COMPUTE_CAP_MAX_INPUT_SIZE, PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE, PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY, - PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS + PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS, + PIPE_COMPUTE_CAP_IMAGES_SUPPORTED }; /** -- 1.8.1.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
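The r600 hunk above follows a size-then-value convention: the hook writes the value only when `ret` is non-NULL and always returns the size of the value. A minimal sketch of that convention, with one plausible caller, is below. `mock_compute_cap`, `mock_get_compute_param` and `mock_device_image_support` are hypothetical names, and the compute-unit count of 8 is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of enum pipe_compute_cap. */
enum mock_compute_cap {
    MOCK_CAP_MAX_COMPUTE_UNITS,
    MOCK_CAP_IMAGES_SUPPORTED
};

/*
 * Hypothetical driver hook: fills *ret only when ret is non-NULL and
 * returns the size in bytes of the value, as in the hunk above.
 */
int mock_get_compute_param(enum mock_compute_cap param, void *ret)
{
    switch (param) {
    case MOCK_CAP_MAX_COMPUTE_UNITS:
        if (ret) {
            uint32_t *max_compute_units = ret;
            *max_compute_units = 8;     /* made-up value */
        }
        return sizeof(uint32_t);

    case MOCK_CAP_IMAGES_SUPPORTED:
        if (ret) {
            uint32_t *images_supported = ret;
            *images_supported = 0;      /* no image support, as in the patch */
        }
        return sizeof(uint32_t);
    }

    return 0;   /* unknown cap: nothing written */
}

/* One plausible caller: query the size first, then fetch the value. */
int mock_device_image_support(void)
{
    uint32_t value = 0;
    int size = mock_get_compute_param(MOCK_CAP_IMAGES_SUPPORTED, NULL);

    if (size != (int) sizeof(value))
        return 0;   /* unexpected size: treat as unsupported */

    mock_get_compute_param(MOCK_CAP_IMAGES_SUPPORTED, &value);
    return value != 0;
}
```

Calling with a NULL `ret` lets the caller size its buffer before the real query, which is why the hook returns `sizeof(uint32_t)` on both paths.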
[Mesa-dev] [PATCH 2/3] clover: Query the device to see if images are supported
--- src/gallium/state_trackers/clover/api/device.cpp | 2 +- src/gallium/state_trackers/clover/core/device.cpp | 6 ++ src/gallium/state_trackers/clover/core/device.hpp | 1 + 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp index 97b2cf9..e825468 100644 --- a/src/gallium/state_trackers/clover/api/device.cpp +++ b/src/gallium/state_trackers/clover/api/device.cpp @@ -184,7 +184,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, break; case CL_DEVICE_IMAGE_SUPPORT: - buf.as_scalar() = CL_TRUE; + buf.as_scalar() = dev.image_support(); break; case CL_DEVICE_MAX_PARAMETER_SIZE: diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp index b6078db..63aa193 100644 --- a/src/gallium/state_trackers/clover/core/device.cpp +++ b/src/gallium/state_trackers/clover/core/device.cpp @@ -169,6 +169,12 @@ device::max_compute_units() const { PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS)[0]; } +bool +device::image_support() const { + return get_compute_param(pipe, + PIPE_COMPUTE_CAP_IMAGES_SUPPORTED)[0]; +} + std::vector device::max_block_size() const { auto v = get_compute_param(pipe, PIPE_COMPUTE_CAP_MAX_BLOCK_SIZE); diff --git a/src/gallium/state_trackers/clover/core/device.hpp b/src/gallium/state_trackers/clover/core/device.hpp index 731c31e..2201700 100644 --- a/src/gallium/state_trackers/clover/core/device.hpp +++ b/src/gallium/state_trackers/clover/core/device.hpp @@ -63,6 +63,7 @@ namespace clover { cl_ulong max_mem_alloc_size() const; cl_uint max_clock_frequency() const; cl_uint max_compute_units() const; + bool image_support() const; std::vector max_block_size() const; std::string device_name() const; -- 1.8.1.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: add ARB_clear_texture.xml to file list, remove duplicate decls
On 24/07/14 02:16, Ilia Mirkin wrote: > Signed-off-by: Ilia Mirkin > --- > > Noticed while helping jekstrand debug his issues. This didn't help, but seems > correct. > The former will cause an issue when building without shared-glapi, while the latter... are just duplicated a few lines above in the same file. Reviewed-by: Emil Velikov > src/mapi/glapi/gen/Makefile.am | 1 + > src/mesa/main/teximage.h | 12 > 2 files changed, 1 insertion(+), 12 deletions(-) > > diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am > index be7d9e0..212731f 100644 > --- a/src/mapi/glapi/gen/Makefile.am > +++ b/src/mapi/glapi/gen/Makefile.am > @@ -112,6 +112,7 @@ API_XML = \ > ARB_base_instance.xml \ > ARB_blend_func_extended.xml \ > ARB_clear_buffer_object.xml \ > + ARB_clear_texture.xml \ > ARB_color_buffer_float.xml \ > ARB_compressed_texture_pixel_storage.xml \ > ARB_compute_shader.xml \ > diff --git a/src/mesa/main/teximage.h b/src/mesa/main/teximage.h > index 984321c..42305f4 100644 > --- a/src/mesa/main/teximage.h > +++ b/src/mesa/main/teximage.h > @@ -336,18 +336,6 @@ _mesa_TexStorage3DMultisample(GLenum target, GLsizei > samples, >GLsizei height, GLsizei depth, >GLboolean fixedsamplelocations); > > -extern void GLAPIENTRY > -_mesa_ClearTexImage(GLuint texture, GLint level, > -GLenum format, GLenum type, > -const void *data); > - > -extern void GLAPIENTRY > -_mesa_ClearTexSubImage(GLuint texture, GLint level, > - GLint xoffset, GLint yoffset, GLint zoffset, > - GLsizei width, GLsizei height, GLsizei depth, > - GLenum format, GLenum type, > - const void *data); > - > bool > _mesa_compressed_texture_pixel_storage_error_check(struct gl_context *ctx, > GLint dimensions, > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] gallium: Add PIPE_COMPUTE_CAP_MAX_CONSTANT_BUFFER_SIZE
---
 src/gallium/docs/source/screen.rst   | 2 ++
 src/gallium/include/pipe/p_defines.h | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 830a1a5..219c9f9 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -334,6 +334,8 @@ pipe_screen::get_compute_param.
   Value type: ``uint32_t``
 * ``PIPE_COMPUTE_CAP_IMAGES_SUPPORTED``: Whether images are supported
   non-zero means yes, zero means no. Value type: ``uint32_t``
+* ``PIPE_COMPUTE_CAP_MAX_CONSTANT_BUFFER_SIZE``: The maximum size in bytes
+  of a constant buffer. Value type: ``uint64_t``
 
 .. _pipe_bind:
 
diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index 43bb1f5..78709b9 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -651,7 +651,8 @@ enum pipe_compute_cap
    PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE,
    PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY,
    PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS,
-   PIPE_COMPUTE_CAP_IMAGES_SUPPORTED
+   PIPE_COMPUTE_CAP_IMAGES_SUPPORTED,
+   PIPE_COMPUTE_CAP_MAX_CONSTANT_BUFFER_SIZE
 };
 
 /**
--
1.8.1.5
[Mesa-dev] [PATCH 3/3] radeon/compute: Return a value for PIPE_COMPUTE_CAP_MAX_CONSTANT_BUFFER_SIZE
---
 src/gallium/drivers/radeon/r600_pipe_common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
index bf0585d..2ea8f3d 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -492,6 +492,7 @@ static int r600_get_compute_param(struct pipe_screen *screen,
       }
       return sizeof(uint64_t);
 
+   case PIPE_COMPUTE_CAP_MAX_CONSTANT_BUFFER_SIZE:
    case PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE:
       if (ret) {
          uint64_t max_global_size;
--
1.8.1.5
[Mesa-dev] [PATCH 2/3] clover: Use correct query for CL_MAX_CONSTANT_BUFFER_SIZE
---
 src/gallium/state_trackers/clover/core/device.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp
index 63aa193..ada5267 100644
--- a/src/gallium/state_trackers/clover/core/device.cpp
+++ b/src/gallium/state_trackers/clover/core/device.cpp
@@ -135,8 +135,8 @@ device::max_mem_input() const {
 
 cl_ulong
 device::max_const_buffer_size() const {
-   return pipe->get_shader_param(pipe, PIPE_SHADER_COMPUTE,
-                                 PIPE_SHADER_CAP_MAX_CONSTS) * 16;
+   return get_compute_param(pipe,
+                            PIPE_COMPUTE_CAP_MAX_CONSTANT_BUFFER_SIZE)[0];
 }
 
 cl_uint
--
1.8.1.5
Re: [Mesa-dev] [PATCH] Add an accelerated version of F_TO_I for x86_64
On Wed, Jul 23, 2014 at 12:01 PM, Jason Ekstrand wrote:
> According to a quick micro-benchmark, this new version is 20% faster on my
> Haswell laptop.
>
> v2: Removed the XXX note about x86_64 from the comment
> v3: Use an intrinsic instead of an __asm__ block. This should give us MSVC
> support for free.
>
> Signed-off-by: Jason Ekstrand
> ---
>  src/mesa/main/imports.h | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
> index af780b2..6eb84ca 100644
> --- a/src/mesa/main/imports.h
> +++ b/src/mesa/main/imports.h
> @@ -274,10 +274,12 @@ static inline int IROUND_POS(float f)
>     return (int) (f + 0.5F);
>  }
>
> +#if defined(USE_X86_64_ASM)
> +# include
> +#endif
>
>  /**
>   * Convert float to int using a fast method. The rounding mode may vary.
> - * XXX We could use an x86-64/SSE2 version here.
>   */
>  static inline int F_TO_I(float f)
>  {
> @@ -292,6 +294,8 @@ static inline int F_TO_I(float f)
>        fistp r
>     }
>     return r;
> +#elif defined(USE_X86_64_ASM)
> +   return _mm_cvt_ss2si(_mm_load_ss(&f));
>  #else
>     return IROUND(f);
>  #endif
> --
> 2.0.1

Reviewed-by: Matt Turner

We could probably just do #ifdef __x86_64__ rather than depending on
x86-64 assembly configure stuff. Change it if you want, otherwise I'm
okay with letting people who build with assembly fix it up if they care.
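For reference, the variant suggested in the review (keying on the compiler's architecture macro instead of the configure-time assembly switch) could look roughly like this. It is a sketch under assumptions, not the committed code: `f_to_i` and `iround_fallback` are hypothetical names, the `_M_X64` test is added on the guess that MSVC needs it, and `<xmmintrin.h>` is assumed as the header that declares the two SSE intrinsics used in the patch.

```c
#if defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_load_ss(), _mm_cvt_ss2si() */
#endif

/* Portable fallback in the spirit of IROUND(): halves round away from zero. */
int iround_fallback(float f)
{
    return (int) ((f >= 0.0F) ? (f + 0.5F) : (f - 0.5F));
}

/* Convert float to int using a fast method. The rounding mode may vary. */
int f_to_i(float f)
{
#if defined(__x86_64__) || defined(_M_X64)
    /*
     * CVTSS2SI rounds according to the current MXCSR rounding mode, which
     * is round-to-nearest-even unless the program changed it.
     */
    return _mm_cvt_ss2si(_mm_load_ss(&f));
#else
    return iround_fallback(f);
#endif
}
```

The two paths agree except on exact halves: with the default rounding mode the SSE path turns 2.5 into 2, while the fallback gives 3. That is the "rounding mode may vary" caveat from the comment in the patch.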
Re: [Mesa-dev] [PATCH 0/6] Add support for BPTC texture compression
On Wed, Jul 23, 2014 at 3:16 PM, Ian Romanick wrote:
> On 07/22/2014 12:09 PM, Neil Roberts wrote:
>> Here's a first attempt at a patch series to implement BPTC texture
>> compression in the i965 driver on Gen>=7.
>>
>> Getting it to work on the hardware is pretty trivial as it's just a
>> case of adding some new Mesa format enums and then plugging them
>> together with the right Intel surface type. However GL requires that
>> you are able to get the library to compress textures on the fly so we
>> need a compressor too. I think for BPTC it doesn't really make much
>> sense to actually use this because it takes a very long time to search
>> the entire space and compress an image properly. For example the
>> NVidia compressor takes in the order of an hour for a full-screen
>> image. Instead I've just done the minimal work needed to get something
>> that gives vaguely passable results.
>
> Is that NVIDIA's off-line compression tool, or is that the compressor in
> the driver? A brute-force compressor will be very, very slow for BPTC.
> There are other approaches that are much faster without sacrificing
> very much quality.

I suggested http://squish.paradice-insight.us/ which appears to be down
now. It has a (MIT licensed) BC7 compressor and no free BC6H compressor.
I suggested using it as an external dependency and contributing a BC6H
compressor back. I think the github repo is
https://github.com/Ethatron/squish-ccr

I was thinking that one of the cool things we might be able to do is an
ETC2 -> BC7 transcode on platforms without ETC2 hardware decompression.
We won't be able to do that without a good compressor.
[Mesa-dev] [Bug 54106] Fix a memory leak in dri2_terminate()
https://bugs.freedesktop.org/show_bug.cgi?id=54106

Tapani Pälli changed:

           What    |Removed |Added
             Status|NEW     |RESOLVED
         Resolution|---     |FIXED

--- Comment #3 from Tapani Pälli ---
this bug got fixed:

--- 8< ---
commit 5cb1cad0aef8d1c426207c955996278290e19e60
Author: Emil Velikov
Date: Mon Jun 2 12:26:17 2014 +0100

    egl/dri2: do not leak dri2_dpy->driver_name
Re: [Mesa-dev] [PATCH 3/6] Add support for swrast to the DRM EGL platform
On Thu, 24 Jul 2014 01:43:35 +0100 Emil Velikov wrote:
> From: Giovanni Campagna
>
> Turn GBM into a swrast loader (providing putimage/getimage backed
> by a dumb KMS buffer). This allows to run KMS+DRM GL applications
> (such as weston or mutter-wayland) unmodified on cards that don't
> have any client side HW acceleration component but that can do
> modeset (examples include simpledrm and qxl)
>
> [Emil Velikov]
> - Fix make check.
> - Split dri_open_driver() from dri_load_driver().
> - Don't try to bind the swrast extensions when using dri.
> - Handle swrast->CreateNewScreen() failure.
> - strdup the driver_name, as it's free'd at destruction.
> - s/LIBGL_ALWAYS_SOFTWARE/GBM_ALWAYS_SOFTWARE/
> - Move gbm_dri_bo_map/unmap to gbm_driint.h.
> - Correct swrast fallback logic.
>
> Signed-off-by: Emil Velikov
> ---
>  src/egl/drivers/dri2/platform_drm.c | 153 +++
>  src/gbm/backends/dri/gbm_dri.c      | 203 +++-
>  src/gbm/backends/dri/gbm_driint.h   |  57 +-
>  src/gbm/gbm-symbols-check           |   1 +
>  src/gbm/main/gbm.h                  |   3 +
>  5 files changed, 369 insertions(+), 48 deletions(-)
>
> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
> index 347bc99..1aca506 100644
> --- a/src/gbm/backends/dri/gbm_dri.c
> +++ b/src/gbm/backends/dri/gbm_dri.c
> @@ -743,7 +886,7 @@ static struct gbm_device *
>  dri_device_create(int fd)
>  {
>     struct gbm_dri_device *dri;
> -   int ret;
> +   int ret, force_sw;
>
>     dri = calloc(1, sizeof *dri);
>     if (!dri)
> @@ -763,7 +906,15 @@ dri_device_create(int fd)
>     dri->base.type = GBM_DRM_DRIVER_TYPE_DRI;
>     dri->base.base.name = "drm";
>
> -   ret = dri_screen_create(dri);
> +   force_sw = getenv("GBM_ALWAYS_SOFTWARE") != NULL;
> +   if (!force_sw) {
> +      ret = dri_screen_create(dri);
> +      if (ret)
> +         ret = dri_screen_create_swrast(dri);
> +   } else {
> +      ret = dri_screen_create_swrast(dri);
> +   }
> +
>     if (ret)
>        goto err_dri;

Hi,

is GBM_ALWAYS_SOFTWARE a new env var? Is it documented somewhere? How does it
interact with EGL_SOFTWARE?

Does GBM_ALWAYS_SOFTWARE affect GBM's ability to import dmabufs somehow, or
only the surfaces that will be passed to EGL? (Importing dmabufs to be passed
directly to KMS for scanout.)

I'm terribly confused by all the *SOFTWARE* variables, and it seems I'm not
alone, as someone just recently filed a bunch of Weston bug reports while
trying to get software GL rendering with LIBGL_ALWAYS_SOFTWARE on DRM/KMS.

Thanks,
pq
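To make the behaviour of the quoted hunk easier to reason about, here is a minimal sketch of the same decision logic pulled out of dri_device_create(). All names in it (env_forces_sw, pick_screen, the stub callbacks and the screen_kind enum) are hypothetical and exist only for this example; the callbacks stand in for dri_screen_create() and dri_screen_create_swrast(), with 0 meaning success as in the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for dri_screen_create() and
 * dri_screen_create_swrast(): returns 0 on success. */
typedef int (*screen_create_fn)(void *dev);

enum screen_kind { SCREEN_NONE, SCREEN_HW, SCREEN_SW };

/* Mirrors `getenv("GBM_ALWAYS_SOFTWARE") != NULL`: only the presence of
 * the variable is tested, so any value, including "0", forces software. */
static int env_forces_sw(const char *value)
{
   return value != NULL;
}

/* Same control flow as the quoted hunk: try hardware unless software is
 * forced, and fall back to swrast if hardware screen creation fails. */
static enum screen_kind
pick_screen(int force_sw, void *dev,
            screen_create_fn create_hw, screen_create_fn create_sw)
{
   if (!force_sw && create_hw(dev) == 0)
      return SCREEN_HW;

   return create_sw(dev) == 0 ? SCREEN_SW : SCREEN_NONE;
}

/* Stub callbacks for experimenting with the logic. */
static int stub_ok(void *dev)   { (void) dev; return 0; }
static int stub_fail(void *dev) { (void) dev; return -1; }
```

A caller would pass env_forces_sw(getenv("GBM_ALWAYS_SOFTWARE")) as force_sw. The sketch only covers screen creation; it says nothing about dmabuf import or how the variable relates to EGL_SOFTWARE, which are the open questions above.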
[Mesa-dev] [Bug 62202] Mesa demo dest window of wincopy flickers while toggle f/b buffer on software render
https://bugs.freedesktop.org/show_bug.cgi?id=62202

Tapani Pälli changed:

           What    |Removed |Added
            Version|9.1     |git

--- Comment #1 from Tapani Pälli ---
updating status, nowadays swrast driver crashes with this demo

--- 8< ---
Drawing to GL_FRONT buffer

Program received signal SIGSEGV, Segmentation fault.
0x76d70483 in blit_copy_pixels (ctx=0x77fd7010, srcx=0, srcy=0, width=300, height=300, dstx=0, dsty=0, type=6144) at state_tracker/st_cb_drawpixels.c:1414
warning: Source file is more recent than executable.
1414 drawY = rbDraw->Base.Height - drawY - drawH;
Missing separate debuginfos, use: debuginfo-install expat-2.1.0-5.fc19.x86_64 libGLEW-1.9.0-3.fc19.x86_64 libXau-1.0.8-1.fc19.x86_64 libXdamage-1.1.4-3.fc19.x86_64 libXext-1.3.2-1.fc19.x86_64 libXfixes-5.0.1-1.fc19.x86_64 libXxf86vm-1.1.3-1.fc19.x86_64 libffi-3.0.13-4.fc19.x86_64 libgcc-4.8.3-1.fc19.x86_64 libstdc++-4.8.3-1.fc19.x86_64 libxcb-1.9-3.fc19.x86_64 llvm-libs-3.3-4.fc19.x86_64 mesa-libGLU-9.0.0-4.fc19.x86_64 zlib-1.2.7-10.fc19.x86_64
(gdb) bt
#0 0x76d70483 in blit_copy_pixels (ctx=0x77fd7010, srcx=0, srcy=0, width=300, height=300, dstx=0, dsty=0, type=6144) at state_tracker/st_cb_drawpixels.c:1414
#1 0x76d708bb in st_CopyPixels (ctx=0x77fd7010, srcx=0, srcy=0, width=300, height=300, dstx=0, dsty=0, type=6144) at state_tracker/st_cb_drawpixels.c:1500
#2 0x76c11b4c in _mesa_CopyPixels (srcx=0, srcy=0, width=300, height=300, type=6144) at main/drawpix.c:269
#3 0x00401814 in Redraw () at wincopy.c:164
#4 0x00401905 in EventLoop () at wincopy.c:232
#5 0x004013cf in main (argc=, argv=) at wincopy.c:312