yaxunl added a comment.

In https://reviews.llvm.org/D44985#1050876, @rjmccall wrote:

> In https://reviews.llvm.org/D44985#1050674, @yaxunl wrote:
>
> > In https://reviews.llvm.org/D44985#1050670, @rjmccall wrote:
> >
> > > What exactly are you trying to express here?  Are you just trying to make 
> > > these external declarations when compiling for the device because 
> > > `__shared__` variables are actually defined on the host?  That should be 
> > > handled by the frontend by setting up the AST so that these declarations 
> > > are not definitions.
> >
> >
> > No. These variables are not like external symbols defined on the host. They 
> > behave like global variables in the kernel code but are never initialized. 
> > Currently no target is able to initialize them, and it is the user's 
> > responsibility to initialize them explicitly.
> >
> > Giving them an initial value causes errors in some backends, which cannot 
> > handle it, so undef is used as the initializer.
>
>
> So undef is being used as a special marker to the backends that it's okay not 
> to try to initialize these variables?


I think undef as the initializer tells the LLVM passes and the backend that 
this global variable contains an undefined value. I am not sure whether this is 
better than leaving the variable without an initializer. I saw this code in 
CodeGenModule::getOrCreateStaticVarDecl:

  // Local address space cannot have an initializer.
  llvm::Constant *Init = nullptr;
  if (Ty.getAddressSpace() != LangAS::opencl_local)
    Init = EmitNullConstant(Ty);
  else
    Init = llvm::UndefValue::get(LTy);

which means that an OpenCL static variable in the local address space 
(equivalent to the CUDA shared address space) gets an undef initializer.

For a CUDA shared variable, CodeGenFunction::EmitStaticVarDecl first calls 
CodeGenModule::getOrCreateStaticVarDecl, which gives the variable a 
zeroinitializer, and then reaches line 400:

  // Whatever initializer such variable may have when it gets here is
  // a no-op and should not be emitted.
  bool isCudaSharedVar = getLangOpts().CUDA && getLangOpts().CUDAIsDevice &&
                         D.hasAttr<CUDASharedAttr>();
  // If this value has an initializer, emit it.
  if (D.getInit() && !isCudaSharedVar)
    var = AddInitializerToStaticVarDecl(D, var);

Although this skips adding the initializer from D, var already has a 
zeroinitializer from CodeGenModule::getOrCreateStaticVarDecl, so its 
initializer needs to be overwritten with undef.

A better solution would probably be to do this in 
CodeGenModule::getOrCreateStaticVarDecl, side by side with the OpenCL code.
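
To make the suggestion above concrete, here is a rough, self-contained sketch 
of the decision being proposed. It is a toy model, not Clang code: AddrSpace, 
InitKind, StaticVarInfo and pickStaticInit are hypothetical names invented for 
illustration, and the real change would operate on the actual types in 
CodeGenModule::getOrCreateStaticVarDecl. The only point is that the CUDA 
shared check sits next to the OpenCL local check, so the variable never 
receives a zeroinitializer that has to be overwritten later.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not Clang types.
enum class AddrSpace { Default, OpenCLLocal };
enum class InitKind { Zero, Undef };

struct StaticVarInfo {
  AddrSpace AS;
  bool IsCUDADeviceCompile; // models getLangOpts().CUDA && CUDAIsDevice
  bool HasCUDASharedAttr;   // models D.hasAttr<CUDASharedAttr>()
};

// Picks the initializer kind for a static local variable in one place.
// OpenCL local variables and CUDA __shared__ variables (device compile only)
// get undef; everything else gets a zero initializer.
inline InitKind pickStaticInit(const StaticVarInfo &V) {
  const bool IsCudaSharedVar = V.IsCUDADeviceCompile && V.HasCUDASharedAttr;
  if (V.AS == AddrSpace::OpenCLLocal || IsCudaSharedVar)
    return InitKind::Undef;
  return InitKind::Zero;
}

// Small usage example covering the cases discussed in this thread.
inline void checkExamples() {
  assert(pickStaticInit({AddrSpace::OpenCLLocal, false, false}) ==
         InitKind::Undef);
  assert(pickStaticInit({AddrSpace::Default, true, true}) == InitKind::Undef);
  assert(pickStaticInit({AddrSpace::Default, false, false}) == InitKind::Zero);
}
```

In this model a variable with the shared attribute in a host compile still 
gets a zero initializer, which mirrors the CUDAIsDevice condition in the 
existing isCudaSharedVar check.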


https://reviews.llvm.org/D44985



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
