That's right. Except really what might have happened was occlusion query; write X; more drawing; write X+1;
and then on the CPU, you see X+1. So the tests are always for >= X.
And if you have more than 2^32 submits, you cry, because I'm *sure*
that nothing implements wraparound properly :)

On Mon, Jul 6, 2015 at 1:45 PM, Vyacheslav Gonakhchyan <ytri...@gmail.com> wrote:
> Ilia, thanks a lot for the info.
>
> So basically if I submit to GPU's command stream:
> perform occlusion query,
> write X to Y.
> I know that query is completed when after reading Y address I get X.
>
> Regards,
> Vyacheslav
>
> On Mon, Jul 6, 2015 at 9:13 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>
>> I'm only really familiar with nouveau, but I think all GPU hardware
>> works in roughly the same way. Basically you have some way of
>> inserting "write X to address Y" into the command stream (aka a
>> "fence"), after which you insert "write X+1 to address Y" and so on.
>> If you want the CPU to wait on a given fence, you just do "while
>> (*address < x);". If you have multiple GPU processing queues, you can
>> usually also insert a "stall this queue until the value at address Y
>> is at least X" command into the command stream.
>>
>> DRM uses implicit fences, so it knows which BOs are used for
>> particular commands. So the flow goes something like "submit bunch of
>> commands; submit fence write and attach that fence id to the BOs in
>> the previous bunch of commands". Then to wait for a bo to become ready,
>> you just wait until the GPU writes the appropriate number to memory
>> address Y (from above).
>>
>> The mesa drivers can sometimes use clever tricks that avoid this
>> sync'ing because it knows exactly how it emits the commands and
>> perhaps it waits on something related earlier whereby it knows the
>> other thing will be ready. No idea if that's the case here.
>>
>> Hope this helps,
>>
>> -ilia
>>
>>
>> On Mon, Jul 6, 2015 at 1:05 PM, Vyacheslav Gonakhchyan
>> <ytri...@gmail.com> wrote:
>> > Ilia, thanks for the gallium link.
>> > Do you know any links to high level info with broad strokes about
>> > how this sync works? Frankly I do not know driver terminology and
>> > wanted to know more about how this sync is performed for my
>> > research. I'm using mesa as a reference because it has open
>> > implementation code. Occlusion query functionality probably waits
>> > for z-buffer to become ready. Problem is that usual synchronization
>> > techniques do not apply here. I'm thinking that driver code gets
>> > notifications about state change. I want to know what kind of
>> > notifications are available? Can query be performed in parallel
>> > with another frame being processed or does it need complete GPU
>> > pipeline flush?
>> >
>> > Thanks,
>> > Vyacheslav
>> >
>> > On Mon, Jul 6, 2015 at 8:32 PM, Ilia Mirkin <imir...@alum.mit.edu>
>> > wrote:
>> >>
>> >> On Mon, Jul 6, 2015 at 11:29 AM, Vyacheslav Gonakhchyan
>> >> <ytri...@gmail.com> wrote:
>> >> > Hi, everyone.
>> >> >
>> >> > Trying to understand method radeonQueryGetResult (more broadly
>> >> > GPU-CPU sync).
>> >> >
>> >> > static void radeonQueryGetResult(struct gl_context *ctx,
>> >> >                                  struct gl_query_object *q)
>> >> > {
>> >> >    struct radeon_query_object *query = (struct radeon_query_object *)q;
>> >> >    uint32_t *result;
>> >> >    int i;
>> >> >
>> >> >    radeon_print(RADEON_STATE, RADEON_VERBOSE,
>> >> >                 "%s: query id %d, result %d\n",
>> >> >                 __func__, query->Base.Id, (int) query->Base.Result);
>> >> >
>> >> >    radeon_bo_map(query->bo, GL_FALSE);
>> >> >    result = query->bo->ptr;
>> >> >
>> >> >    query->Base.Result = 0;
>> >> >    for (i = 0; i < query->curr_offset/sizeof(uint32_t); ++i) {
>> >> >       query->Base.Result += LE32_TO_CPU(result[i]);
>> >> >       radeon_print(RADEON_STATE, RADEON_TRACE, "result[%d] = %d\n",
>> >> >                    i, LE32_TO_CPU(result[i]));
>> >> >    }
>> >> >
>> >> >    radeon_bo_unmap(query->bo);
>> >> > }
>> >> >
>> >> > I don't know which part is responsible for blocking behavior
>> >> > (waiting for response from GPU). I suspect that radeon_bo_map
>> >> > does this magic.
>> >> > Can someone point in the right direction?
>> >>
>> >> The radeon_bo_map defined in
>> >> src/gallium/winsys/radeon/drm/radeon_drm_bo.c indeed has this magic.
>> >> However the code in src/mesa/drivers/dri/radeon/radeon_queryobj.c
>> >> references the radeon_bo_map in libdrm, which does not appear to wait.
>> >>
>> >> FWIW for nouveau, nouveau_bo_map will also implicitly do a
>> >> nouveau_bo_wait, but that does not appear to be the case for radeon.
>> >>
>> >> Cheers,
>> >>
>> >> -ilia
>> >
>> >
>
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
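
For reference, the fence test discussed in this thread can be sketched in
C roughly as below. This is only an illustration under stated
assumptions: fence_passed, fence_passed_wrapsafe and fence_wait are
hypothetical names, not nouveau, radeon or libdrm functions, and a plain
memory word stands in for the "address Y" that the GPU writes to. The
signed-difference variant is a common sequence-number trick, not
something the thread says any driver does, and it only holds while fewer
than 2^31 fences are outstanding at once.

```c
#include <stdbool.h>
#include <stdint.h>

/* Test from the thread: fence X has passed once the word at address Y
 * holds X or anything later ("the tests are always for >= X").
 * This breaks when the 32-bit counter wraps around. */
static bool fence_passed(uint32_t current, uint32_t seq)
{
    return current >= seq;
}

/* Assumption, not from the thread: compare via the signed difference so
 * that a wrapped counter still orders correctly, as long as the two
 * values are less than 2^31 apart. The unsigned-to-signed conversion is
 * implementation-defined in older C standards; it wraps on the usual
 * two's complement targets. */
static bool fence_passed_wrapsafe(uint32_t current, uint32_t seq)
{
    return (int32_t)(current - seq) >= 0;
}

/* CPU-side wait, i.e. the "while (*address < x);" loop from the thread.
 * The pointer is volatile because another agent (the GPU) updates it. */
static void fence_wait(const volatile uint32_t *addr, uint32_t seq)
{
    while (!fence_passed(*addr, seq)) {
        /* busy-wait; a real driver would typically sleep or poll */
    }
}
```

With this, a CPU that reads 6 from the fence word still treats fence 5
as complete, which is the "write X; more drawing; write X+1" case from
the top of the thread.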