Jan Vesely <jan.ves...@rutgers.edu> writes: > On Sat, 2014-08-16 at 13:13 +0300, Francisco Jerez wrote: >> Jan Vesely <jan.ves...@rutgers.edu> writes: >> >> > On Thu, 2014-08-07 at 16:02 +0300, Francisco Jerez wrote: >> >> Jan Vesely <jan.ves...@rutgers.edu> writes: >> >> >> >> > This respin includes Francisco's approach of providing implicit >> >> > in the arg vector passed from clover, and Tom's idea of appending >> >> > implicit args after the kernel args. >> >> > >> >> >> >> Hmmm... Maybe it would make sense to add some sort of versioning >> >> (e.g. as part of the target triple) to the binary interface between >> >> clover and the kernel instead, so we can handle this sort of >> >> non-backwards compatible changes and the compiler back-end and libclc >> >> have some way to find out whether some specific feature is available and >> >> e.g. some specific extension should be enabled. >> >> >> >> > I assumed it's not safe to modify exec.input, so the input vector is >> >> > copied >> >> > before appending work dim. >> >> > >> >> >> >> Why wouldn't it be safe? You just need to make sure they're appended >> >> before the compute state is created. >> > >> > I thought there might be a problem when called from multiple threads, >> > but it looks like most of the vars are local to the current call anyway. >> > >> > I looked at the code a bit better, and need a bit of help with what the >> > proffered approach would be. >> > >> > exec_context::bind() appends all kernel args to the input vector. If the >> > implicit args are added before bind() it shifts all other args, which is >> > not what we want. >> > if the implicit args are appended after, they are not accounted for in >> > shader->input_size (and not copied by the driver). >> > >> > my current code modifies exec_context::bind() to preserve the content of >> > input before binding kernel args, and append the old content after the >> > args are bound. >> > I have also considered passing and implicit args vector to >> > exec_context::bind to make the trick more visible. >> > >> > Turning workdim into a proper arg in _args does not work either, because >> > it is not present in module args. >> > >> > any thoughts? >> > >> >> I finally had a chance to take a closer look at your series. It looks >> like you're right: In order to implement my proposal cleanly, implicit >> arguments would have to be part of the _args array so the compiler would >> have to include them in the module argument lists with memory layout >> parameters (e.g. alignment, size) suitable for the hardware, so there's >> probably little benefit compared to your original approach that includes >> the number of dimensions as an additional launch_grid() parameter. >> >> So we don't have to change it again, can you add another array parameter >> for the base grid offset? That's another thing we don't pass through >> the pipe driver API currently and CL requires. The prototype of >> launch_grid could look like: >> >> | void (*launch_grid)(struct pipe_context *context, uint dims, >> | const uint *block_layout, const uint *grid_layout, >> | const uint *grid_offset, uint32_t pc, >> | const void *input); >> >> And don't forget to update the docs. :) > > Hi, > > I wanted to explore the original idea of appending implicit args, since > launch_grid is driver specific and would need to reimplement the same > functionality in every driver. > > I came up with a solution (see the attached patch). I don't like that > the implicit arg needs to be set in api function.
Right, if we do it this way it would probably be a better fit for the clover::kernel code, the CL front-end doesn't really need to be aware of implicit args. > I also don't like that this way there is no difference between > explicit and implicit kernel arguments. On the other hand it's simple, > and does not need additional per driver code. > Yeah... We definitely want to hide these from the user, as e.g. the CL_KERNEL_NUM_ARGS param is required by the spec to return the number of arguments provided by the user, and we don't want the user to set implicit args, so it gets a bit messy. I think I like better your original idea of passing them as launch_grid() arguments, even though the grid offset and dimension parameters are somewhat artificial from a the hardware's point of view. > thanks, > jan > >> >> Thank you. >> >> > thanks, >> > jan >> > >> > >> >> >> >> > Passes get-work-dim piglit on turks without any regression, >> >> > I have not tested SI as I don't have the hw. >> >> > >> >> > jan >> >> > >> >> > >> >> > >> >> > >> >> > Jan Vesely (3): >> >> > gallium: Pass input data size to launch_grid >> >> > clover: Add work dimension implicit param to input >> >> > r600,radeonsi: Copy implicit args provided by clover >> >> > >> >> > src/gallium/drivers/ilo/ilo_gpgpu.c | 2 +- >> >> > src/gallium/drivers/nouveau/nvc0/nvc0_compute.c | 2 +- >> >> > src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 4 +- >> >> > src/gallium/drivers/nouveau/nvc0/nve4_compute.c | 2 +- >> >> > src/gallium/drivers/r600/evergreen_compute.c | 14 +- >> >> > src/gallium/drivers/r600/evergreen_compute.h | 1 - >> >> > src/gallium/drivers/radeonsi/si_compute.c | 6 +- >> >> > src/gallium/include/pipe/p_context.h | 2 +- >> >> > src/gallium/state_trackers/clover/core/kernel.cpp | 162 >> >> > ++++++++++++---------- >> >> > src/gallium/tests/trivial/compute.c | 40 +++--- >> >> > 10 files changed, 122 insertions(+), 113 deletions(-) >> >> > >> >> > -- >> >> > 1.9.3 >> > >> > -- >> > Jan Vesely <jan.ves...@rutgers.edu> > > -- > Jan Vesely <jan.ves...@rutgers.edu> > From 7ad338ebd3a67b19d4ba492fb5a4cbda418fcdad Mon Sep 17 00:00:00 2001 > From: Jan Vesely <jan.ves...@rutgers.edu> > Date: Mon, 1 Sep 2014 19:18:12 -0400 > Subject: [PATCH RFC 1/1] clover: Append implicit work dim arg > > Signed-off-by: Jan Vesely <jan.ves...@rutgers.edu> > --- > src/gallium/state_trackers/clover/api/kernel.cpp | 6 ++++++ > .../state_trackers/clover/llvm/invocation.cpp | 20 > ++++++++++++++------ > 2 files changed, 20 insertions(+), 6 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/api/kernel.cpp > b/src/gallium/state_trackers/clover/api/kernel.cpp > index 05cc392..a3b9735 100644 > --- a/src/gallium/state_trackers/clover/api/kernel.cpp > +++ b/src/gallium/state_trackers/clover/api/kernel.cpp > @@ -276,6 +276,9 @@ clEnqueueNDRangeKernel(cl_command_queue d_q, cl_kernel > d_kern, > auto block_size = validate_block_size(q, kern, dims, > d_grid_size, d_block_size); > > + cl_uint work_dim_val = block_size.size(); > + kern.args().back().set(sizeof(work_dim_val), &work_dim_val); > + > validate_common(q, kern, deps); > > auto hev = create<hard_event>( > @@ -299,6 +302,9 @@ clEnqueueTask(cl_command_queue d_q, cl_kernel d_kern, > auto &kern = obj(d_kern); > auto deps = objs<wait_list_tag>(d_deps, num_deps); > > + cl_uint work_dim_val = 1; > + kern.args().back().set(sizeof(work_dim_val), &work_dim_val); > + > validate_common(q, kern, deps); > > auto hev = create<hard_event>( > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp > b/src/gallium/state_trackers/clover/llvm/invocation.cpp > index 7bca0d6..a934384 100644 > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp > @@ -315,17 +315,17 @@ namespace { > kernel_func = kernels[i]; > kernel_name = kernel_func->getName(); > > - for (llvm::Function::arg_iterator I = kernel_func->arg_begin(), > - E = kernel_func->arg_end(); I != E; > ++I) { > - llvm::Argument &arg = *I; > #if HAVE_LLVM < 0x0302 > - llvm::TargetData TD(kernel_func->getParent()); > + llvm::TargetData TD(kernel_func->getParent()); > #elif HAVE_LLVM < 0x0305 > - llvm::DataLayout TD(kernel_func->getParent()->getDataLayout()); > + llvm::DataLayout TD(kernel_func->getParent()->getDataLayout()); > #else > - llvm::DataLayout TD(mod); > + llvm::DataLayout TD(mod); > #endif > > + for (llvm::Function::arg_iterator I = kernel_func->arg_begin(), > + E = kernel_func->arg_end(); I != E; > ++I) { > + llvm::Argument &arg = *I; > llvm::Type *arg_type = arg.getType(); > const unsigned arg_store_size = TD.getTypeStoreSize(arg_type); > > @@ -384,6 +384,14 @@ namespace { > } > } > > + // Implicit arguments > + // Work dimensions (cl_uint), uint is 32 bit > + llvm::Type *target_type = llvm::Type::getInt32Ty(mod->getContext()); > + args.push_back(module::argument(module::argument::scalar, > + sizeof(cl_uint), TD.getTypeStoreSize(target_type), > + TD.getABITypeAlignment(target_type), > + module::argument::zero_ext)); > + > m.syms.push_back(module::symbol(kernel_name, 0, i, args )); > } > > -- > 1.9.3
pgpy_mccouRjp.pgp
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev