On 9/17/18 5:34 PM, Jason Ekstrand wrote:
On Mon, Sep 17, 2018 at 8:34 AM Danylo Piliaiev
<[email protected]> wrote:
Hi Jason,
I have implemented the extension and it works. However, before
sending the patch I decided to see how it interacts with another
extension, VK_EXT_conditional_render, and got confused.
The spec does not disallow calling VK_KHR_draw_indirect_count
functions inside a conditional rendering block. So let's say the
conditional rendering predicate results in FALSE and we call
vkCmdDrawIndirectCountKHR. It sees that a predicate has already been
emitted and has to take it into account: since that predicate is
FALSE, all subsequent predicates should also result in FALSE. The
issue is that I don't see an easy way to do this.
My current implementation uses the following predicate (the same as
in the GL implementation):
    /* While draw_index < maxDrawCount the predicate's result will be
     *  (draw_index == maxDrawCount) ^ TRUE = TRUE
     * When draw_index == maxDrawCount the result is
     *  (TRUE) ^ TRUE = FALSE
     * After this all results will be:
     *  (FALSE) ^ FALSE = FALSE
     */
    anv_batch_emit(&cmd_buffer->batch, GENX(MI_PREDICATE), mip) {
       mip.LoadOperation    = LOAD_LOAD;
       mip.CombineOperation = COMBINE_XOR;
       mip.CompareOperation = COMPARE_SRCS_EQUAL;
    }
But if the initial predicate state is FALSE, then when draw_index
equals maxDrawCount the result will be
    (TRUE) ^ FALSE = TRUE
which isn't what we want. Without a "not equal" compare operation or
MI_MATH, I don't see how to fix this.
First off, thanks for looking into the combination of these two
features. Getting them to work together nicely is half of the
difficulty of these two extensions.
On platforms which support MI_MATH, I think we're probably better off
just using it. For Ivy Bridge, the only thing I could think to do
when both are in use would be to do two MI_PREDICATEs for every draw
call. The first would be what you describe above and the second would
be the MI_PREDICATE for the conditional render with COMBINE_AND. When
the condition is true, the AND would have no effect and you would get
the behavior above. If the condition is false, the above logic for
implementing draw_indirect_count wouldn't matter because it would get
ANDed with false. On Haswell and later, it's likely more efficient to
just use MI_MATH and avoid re-loading the draw count and condition on
every draw call. (We could just leave the draw count in CS_GPR0, for
instance.) Does that work?
Looks like a plan. I'll try to go this path.
There is also another interaction that wasn't considered before:
several vkCmdDrawIndirectCountKHR calls in one conditional render
block. Using MI_MATH should solve that as well.
Since you're already looking at it, it may be best to implement the
two extensions together as one patch series so we can be sure we have
the interactions right. If we can't get them to play nicely together,
we may have to disable one of them on Ivy Bridge and I'd rather not
enable an extension and then take the functionality away later.
I agree, the extensions are too intertwined, which I realized when I
implemented the most basic version of EXT_conditional_render. I'll also
make sure to test all of these on Ivy Bridge.
I don't see anything related in the Vulkan or GL specs, nor do I see
anything in the Piglit and CTS tests.
Maybe I'm missing something obvious. Could you help me here?
There's nothing preventing the two from being used together. If we
don't have piglit tests that exercise the GL versions together, that
would be bad. Have you found good Vulkan CTS tests for either of
those two extensions? VK_KHR_indirect_count should have tests since
it's a KHR extension but we may need to write the tests for
EXT_conditional_render.
There are no tests of how these features work together in Piglit or
Vulkan CTS. My previous observations also hold for GL, so it ought to
be fixed there too (I'll write a Piglit test first to confirm this).
There are tests for VK_KHR_indirect_count in Vulkan CTS, which my
current implementation passes. There aren't any tests for
EXT_conditional_render; however, I used an example from
https://github.com/SaschaWillems/Vulkan to test my initial
implementation of it.
Should tests for EXT_conditional_render go into Vulkan CTS?
Also, since the scope of the work has grown quite a lot and I'll soon
be on vacation, the implementation won't be ready until at least the
second week of October (just making sure no one thinks I ran away
scared =) )
--Jason
You can find the current implementation at
https://gitlab.freedesktop.org/GL/mesa/commit/9d1c7ae0db618c6f7281d5f667c96612ff0bb2c2
- Danil
On 9/12/18 6:30 PM, Danylo Piliaiev wrote:
Hi,
Thank you for the directions!
On 9/12/18 6:13 PM, Jason Ekstrand wrote:
Danylo,
You're free to implement anything not already implemented. Here
are some other (probably simpler) extensions that I think can be
reasonably implemented on Intel HW:
- VK_EXT_conservative_rasterization
- VK_EXT_conditional_render
I hadn't seen them; I'll take a closer look later.
As far as VK_KHR_draw_indirect_count goes, I haven't implemented
it yet because the "proper" implementation is actually kind-of
painful though not impossible. In general, there are two ways
it can be done:
## 1. The cheap and easy way
The spec explicitly allows for the cheap and easy way by
requiring the caller to pass in a maxDrawCount. The idea here
would be to emit maxDrawCount draw calls but have each one of
them predicated on draw_id < draw_count_from_buffer. This one
probably wouldn't take much to wire up but it does mean doing
maxDrawCount 3DPRIMITIVE commands no matter how many of them are
actually needed.
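As a sketch of the cost trade-off described above, the following CPU-side model counts how many draw commands are emitted versus how many actually pass the predicate. It is illustrative only: the struct and function names are hypothetical, and the predicate is simply draw_id < draw_count_from_buffer as stated in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the model below. */
struct model_result {
    uint32_t emitted;   /* draw commands placed in the batch */
    uint32_t executed;  /* draw commands whose predicate passed */
};

/* Models method 1: always emit max_draw_count draws, each one
 * predicated on draw_id < draw_count_from_buffer. */
static struct model_result model_method1(uint32_t draw_count_from_buffer,
                                         uint32_t max_draw_count)
{
    struct model_result r = { 0, 0 };
    for (uint32_t draw_id = 0; draw_id < max_draw_count; draw_id++) {
        r.emitted++;
        bool predicate = draw_id < draw_count_from_buffer;
        if (predicate)
            r.executed++;
    }
    assert(r.executed <= r.emitted);
    return r;
}
```

For example, a buffer count of 3 with maxDrawCount of 8 gives 8 emitted commands and 3 executed ones, which is the overhead the paragraph above refers to.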
I saw such an implementation for i965. It looked straightforward and
I thought it would translate easily into a Vulkan implementation.
I didn't know it was possible to do it another way on Intel.
## 2. The hard but maybe more correct way
The Intel command streamer does have the ability, if used
carefully, to loop. The difficulty here isn't in looping; that
can be done fairly easily on gen8+ by emitting a predicated
MI_BATCH_BUFFER_START that's predicated off of the looping
condition which jumps to the top of the loop. The real
difficult bit is taking your loop counter and using it to
indirectly access the array of draw information. In order to do
this, you have to have a self-modifying batch buffer. In short,
you would emit MI commands which read the draw information into
registers and also emit MI commands (which would probably come
before the first set) which write the actual address into the
location in the batch where the first set of MI commands has
their address to read from. This would be a painful-to-debug
mess of GPU hangs but could actually be kind-of fun to implement.
The correct way looks interesting; I'll need some time to
understand the details.
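The self-modifying idea can be illustrated with a toy interpreter running on the CPU. This is a loose sketch under heavy assumptions: the opcodes, the struct layout and the function name are invented for illustration and do not correspond to real MI commands, and the zero-count case is handled with an early return where a real batch would need to skip the loop body some other way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented opcodes for a toy "batch"; not real hardware commands. */
enum model_op {
    MODEL_PATCH_OPERAND, /* write the loop counter into another command */
    MODEL_LOAD_PARAM,    /* reg = draw_params[operand] */
    MODEL_DRAW,          /* record a draw using reg */
    MODEL_INCREMENT,     /* counter++ */
    MODEL_JUMP_IF_LESS,  /* jump to batch[operand] while counter < draw_count */
    MODEL_END
};

struct model_cmd {
    enum model_op op;
    uint32_t operand;
};

static uint32_t model_looping_batch(const uint32_t *draw_params,
                                    uint32_t draw_count,
                                    uint32_t *drawn, uint32_t drawn_cap)
{
    struct model_cmd batch[] = {
        { MODEL_PATCH_OPERAND, 1 }, /* patches batch[1].operand each iteration */
        { MODEL_LOAD_PARAM, 0 },    /* operand is a placeholder until patched */
        { MODEL_DRAW, 0 },
        { MODEL_INCREMENT, 0 },
        { MODEL_JUMP_IF_LESS, 0 },  /* loops back to batch[0] */
        { MODEL_END, 0 },
    };
    uint32_t counter = 0, reg = 0, emitted = 0;
    size_t pc = 0;

    assert(draw_params != NULL && drawn != NULL);
    if (draw_count == 0)
        return 0; /* modelled as an early exit; see the lead-in */

    while (batch[pc].op != MODEL_END) {
        const struct model_cmd cmd = batch[pc];
        size_t next = pc + 1;
        switch (cmd.op) {
        case MODEL_PATCH_OPERAND:
            batch[cmd.operand].operand = counter; /* the self-modifying step */
            break;
        case MODEL_LOAD_PARAM:
            reg = draw_params[cmd.operand];
            break;
        case MODEL_DRAW:
            assert(emitted < drawn_cap);
            drawn[emitted++] = reg;
            break;
        case MODEL_INCREMENT:
            counter++;
            break;
        case MODEL_JUMP_IF_LESS:
            if (counter < draw_count)
                next = cmd.operand;
            break;
        case MODEL_END:
            break;
        }
        pc = next;
    }
    return emitted;
}
```

The point of the model is the ordering: the patching command has to run before the load it rewrites, which matches the remark above that the address-writing MI commands would probably come before the commands that read the draw information.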
I hope I haven't scared you away from working on anv; I just
wanted to make it clear what you're getting yourself into. Both
ways are totally implementable and I think you'd pretty much
have to do the first method on gen7 if we really care about
supporting it there. The second is totally doable, it'll just
involve some headaches when it's broken. If you want to
continue with this project after reading my scary e-mail, I
recommend starting with method 1 to get your feet wet and then
we can look into method 2 once you have that working.
I'll follow your recommendation and start with the first method.
- Danil
--Jason
On Wed, Sep 12, 2018 at 6:36 AM Danylo Piliaiev
<[email protected]> wrote:
Hello everyone,
I would like to try to implement one of the Vulkan extensions,
VK_KHR_draw_indirect_count, for anv, unless someone is already
working on it.
It's a relatively minor extension and I saw that the same
functionality is already implemented for ARB_indirect_parameters
in i965.
I would also appreciate any tips if there are any known tricky parts.
- Danil
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev