jfb added a comment.

In D79279#2195299 <https://reviews.llvm.org/D79279#2195299>, @rjmccall wrote:

> Patch looks basically okay to me, although I'll second Richard's concern that 
> we shouldn't absent-mindedly start producing overloaded `memcpy`s for 
> ordinary `__builtin_memcpy`.

Yeah I think that's a leftover from the first patch. Should I drop it? At the 
same time, address spaces are currently accidentally "working", should I drop 
that in a patch before this change? Or leave as-is?



================
Comment at: clang/docs/LanguageExtensions.rst:2446
+order in which they occur (and in which they are observable) can only be
+guaranteed using appropriate fences around the function call. Element size must
+therefore be a lock-free size for the target architecture. It is a runtime
----------------
rjmccall wrote:
> "*The* element size must..."  But I would suggest using "access size" 
> consistently rather than "element size".
I'm being consistent with the naming for IR, which uses "element" as well. I'm 
not enamored with the naming, but wanted to point out the purposeful 
consistency to make sure you preferred "access size". Without precedent I would 
indeed prefer "access size", but have a slight preference for consistency here. 
This is extremely weakly held preference.

(I fixed "the").


================
Comment at: clang/docs/LanguageExtensions.rst:2454
+and might be non-uniform throughout the operation.
+
+The builtins can be used as building blocks for different facilities:
----------------
rsmith wrote:
> jfb wrote:
> > rsmith wrote:
> > > From the above description, I think the documentation is unclear what the 
> > > types `T` and `U` are used for. I think the answer is something like:
> > > 
> > > """
> > > The types `T` and `U` are required to be trivially-copyable types, and 
> > > `byte_element_size` (if specified) must be a multiple of the size of both 
> > > types. `dst` and `src` are expected to be suitably aligned for `T` and 
> > > `U` objects, respectively.
> > > """
> > > 
> > > But... we actually get the alignment information by analyzing pointer 
> > > argument rather than from the types, just like we do for `memcpy` and 
> > > `memmove`, so maybe the latter part is not right. (What did you intend 
> > > regarding alignment for the non-atomic case?) The trivial-copyability and 
> > > divisibility checks don't seem fundamentally important to the goal of the 
> > > builtin, so I wonder if we could actually just use `void` here and remove 
> > > the extra checks. (I don't really have strong views one way or the other 
> > > on this, except that we should either document what `T` and `U` are used 
> > > for or make the builtins not care about the pointee type beyond its 
> > > qualifiers.)
> > You're right. I've removed most treatment of `T` / `U`, and updated the 
> > documentation.
> > 
> > I left the trivial copy check, but `void*` is a usual escape hatch.
> > 
> > Divisibility is now only checked for `size` / `element_size`.
> Please document the trivial copy check.
Should I bubble this to the rest of the builtin in a follow-up patch? I know 
there are cases where that'll cause issues, but I worry that it would be a 
pretty noisy diagnostic (especially if we instead bubble it to C's `memcpy` 
instead).


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2639-2640
   case Builtin::BI__builtin_mempcpy: {
+    QualType DestTy = getPtrArgType(CGM, E, 0);
+    QualType SrcTy = getPtrArgType(CGM, E, 1);
     Address Dest = EmitPointerWithAlignment(E->getArg(0));
----------------
rsmith wrote:
> jfb wrote:
> > rsmith wrote:
> > > Looking through implicit conversions in `getPtrArgType` here will change 
> > > the code we generate for cases like:
> > > 
> > > ```
> > > void f(volatile void *p, volatile void *q) {
> > >   memcpy(p, q, 4);
> > > }
> > > ```
> > > 
> > > ... (in C, where we permit such implicit conversions) to use a volatile 
> > > memcpy intrinsic. Is that an intentional change?
> > I'm confused... what's the difference that this makes for the pre-existing 
> > builtins? My intent was to get the `QualType` unconditionally, but I can 
> > conditionalize it if needed... However this ought to make no difference:
> > ```
> > static QualType getPtrArgType(CodeGenModule &CGM, const CallExpr *E,
> >                               unsigned ArgNo) {
> >   QualType ArgTy = E->getArg(ArgNo)->IgnoreImpCasts()->getType();
> >   if (ArgTy->isArrayType())
> >     return CGM.getContext().getAsArrayType(ArgTy)->getElementType();
> >   if (ArgTy->isObjCObjectPointerType())
> >     return ArgTy->castAs<clang::ObjCObjectPointerType>()->getPointeeType();
> >   return ArgTy->castAs<clang::PointerType>()->getPointeeType();
> > }
> > ```
> > and indeed I can't see the example you provided change in IR from one to 
> > the other. The issue I'm working around is that getting it unconditionally 
> > would make ObjC code sad when `id` is passed in as I outlined above.
> The example I gave should produce a non-volatile memcpy, and used to do so 
> (we passed `false` as the fourth parameter to `CreateMemCpy`). With this 
> patch, `getPtrArgType`will strip off the implicit conversion from `volatile 
> void*` to `void*` in the argument type, so `isVolatile` below will be `true`, 
> so I think it will now create a volatile memcpy for the same testcase. If 
> that's not what's happening, then I'd like to understand why not :)
> 
> I'm not saying that's necessarily a bad change, but it is a change, and it's 
> one we should make intentionally if we make it at all.
Oh yes, sorry I thought you were talking about something that `getPtrArgType` 
did implicitly! Indeed the C code behaves differently in that it doesn't just 
strip `volatile` anymore.

I'm not super thrilled by the default C behavior, and I think this new behavior 
removes a gotcha, and is in fact what I was going for in the first iteration of 
the patch. Now that I've separated the builtin I agree that it's a bit odd... 
but it seems like the right thing to do anyways? But it no longer matches the C 
library function to do so.

FWIW, this currently "works as you'd expect":
```
void f(__attribute__((address_space(32))) void *dst, const void *src, int sz) {
    __builtin_memcpy(dst, src, sz);
}
```
https://godbolt.org/z/dcWxcK

and I think that's completely accidental because the C library function doesn't 
(and, as John pointed out earlier, the builtin is meant to act like the C 
function).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79279/new/

https://reviews.llvm.org/D79279

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to