Re: [swift-evolution] [swift-evolution-announce] [Review] SE-0107: UnsafeRawPointer API

Andrew Trick via swift-evolution Tue, 05 Jul 2016 12:06:54 -0700

> On Jul 2, 2016, at 10:10 PM, Brent Royal-Gordon via swift-evolution 
> <swift-evolution@swift.org> wrote:
> 
> More concrete issues:
> 
> * Is there a reason there's a `load` that takes a byte offset, but not a 
> `storeRaw`?


A couple of those methods were added by request, but I was reluctant to add 
more methods just because they looked like they should be there.
But since you mention it, if we decide to keep the purely additive 
`load(fromByteOffset:as:)` method, then I’ll also add this for symmetry:

  func storeRaw<T>(_: T, toByteOffset: Int)

Should it be "toByteOffset" or "atByteOffset"?

https://github.com/apple/swift-evolution/pull/410/files
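
A minimal sketch of the symmetric byte-offset store and load. It assumes the spellings that later shipped in the standard library (`storeBytes(of:toByteOffset:as:)`, `allocate(byteCount:alignment:)`), not the draft names used in this thread; `roundTrip` is a hypothetical helper name.

```swift
// Sketch only: standard-library names as shipped, not the draft names above.
func roundTrip() -> (UInt32, Int64) {
    let raw = UnsafeMutableRawPointer.allocate(byteCount: 16, alignment: 8)
    defer { raw.deallocate() }

    // Store two trivial values at explicit, suitably aligned byte offsets.
    raw.storeBytes(of: 0x1122_3344, toByteOffset: 0, as: UInt32.self)
    raw.storeBytes(of: 42, toByteOffset: 8, as: Int64.self)

    // Read them back with the matching byte-offset load.
    return (raw.load(fromByteOffset: 0, as: UInt32.self),
            raw.load(fromByteOffset: 8, as: Int64.self))
}
```

Both values are trivial types, which is the only case the raw store is documented to handle.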

> * I'm also a little nervous about the fact that `storeRaw` (and `load`?) is 
> documented to only work properly on "trivial types", but it doesn't have any 
> sort of constraints to ensure it's used correctly. (One could imagine, for 
> instance, the compiler automatically conforming trivial types to a `Trivial` 
> protocol.)

Noted. There's absolutely no way to enforce that any overwritten value is 
trivial, but a sanitizer could catch it. When we have a "trivial" protocol we 
can add a debugPrecondition on the destination type.

`load` simply does not need to take a trivial type. Although it reads from raw 
memory, it is not a "raw" operation. It knows how to retain things.

I'd rather not introduce a symmetric `store` that handles nontrivial types, 
until we really need it, because of the serious potential for misuse. People 
should try to use typed pointers for assignment semantics.

There's a discussion on this in the proposal now:
https://github.com/apple/swift-evolution/blob/master/proposals/0107-unsaferawpointer.md#raw-memory-access
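
To illustrate why `load` does not need a trivial type, here is a small sketch, again assuming the standard-library names as they later shipped (`initializeMemory(as:repeating:count:)`, `load(as:)`). `Box` and `loadNontrivial` are hypothetical names used only for this example.

```swift
// Hypothetical reference type, so values of it are nontrivial.
final class Box {
    let value: Int
    init(_ value: Int) { self.value = value }
}

func loadNontrivial() -> Int {
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: MemoryLayout<Box>.stride,
        alignment: MemoryLayout<Box>.alignment)

    // Initialize the memory rather than raw-storing into it, because the
    // stored reference has to be retained.
    let typed = raw.initializeMemory(as: Box.self, repeating: Box(7), count: 1)

    // load copies the reference out of raw memory and retains the copy.
    let copy = raw.load(as: Box.self)

    // Release the reference still held in the buffer, then free the buffer.
    typed.deinitialize(count: 1)
    raw.deallocate()

    // The loaded copy stays valid because it owns its own retain.
    return copy.value
}
```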

> * I don't think I understand `initialize(toContiguous:atIndex:with:)`. Does 
> it return a typed pointer to the whole buffer, or just the one instance it 
> initialized? In the `stringFromBytes` example, shouldn't we either subscript 
> the typed pointer from the previous `initialize(_:with:count:)` call, or call 
> `storeRaw(toContiguous:atIndex:with:)`, rather than initializing memory 
> twice? If this isn't a good use case for 
> `initialize(toContiguous:atIndex:with:)`, what would be?

The latest proposal has this example, which actually ignores the returned value:

  let rawBuffer = UnsafeMutableRawPointer.allocate(bytes: size + 1)
  rawBuffer.initialize(UInt8.self, with: value, count: size)
  rawBuffer.initialize(toContiguous: UInt8.self, atIndex: size, with: 0)

This was requested as a convenience. As mentioned in a previous email, I'm 
happy to drop it for now.

`initialize(toContiguous:with:count:)` returns a typed pointer to all the 
initialized elements.

Subscripting the typed pointer to write the null terminator would be wrong 
because that memory has never been bound to a type.

I do agree that the example should just be:

  let rawBuffer = UnsafeMutableRawPointer.allocate(bytes: size + 1)
  rawBuffer.initialize(UInt8.self, with: value, count: size)
  rawBuffer.initialize(contiguous: UInt8.self, at: size, to: 0)

But the easy, common way to initialize a C string will simply be:

  let cstr = UnsafeMutablePointer<CChar>.allocate(capacity: size + 1)
  // The whole string is now bound to CChar
  for i in 0..<size { cstr[i] = … }
  cstr[size] = 0
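
For comparison, the raw-buffer variant can be sketched as a complete function. This assumes the shipped standard-library spellings (`allocate(byteCount:alignment:)`, `initializeMemory(as:repeating:count:)`) rather than the draft names above, and `rawCString` is a hypothetical helper.

```swift
func rawCString(repeating value: UInt8, size: Int) -> String {
    let rawBuffer = UnsafeMutableRawPointer.allocate(byteCount: size + 1, alignment: 1)
    defer { rawBuffer.deallocate() }

    // Initialize the first `size` bytes as UInt8; the result is a typed pointer.
    let chars = rawBuffer.initializeMemory(as: UInt8.self, repeating: value, count: size)

    // Initialize the terminator through the raw pointer, since that byte
    // has not been bound to a type yet.
    (rawBuffer + size).initializeMemory(as: UInt8.self, repeating: 0, count: 1)

    // Copy the bytes into a String before the buffer is freed.
    return String(cString: chars)
}
```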

> I'm quite concerned by the "moveInitialize should be more elegant" section at 
> the bottom.
> 
> Since the types are so close, `moveInitialize` could require mutating 
> arguments and actually swap the pointers. For instance:
> 
>       func grow(buffer: UnsafePointer<Int>, count: Int, toNewCapacity capacity: Int) -> UnsafeBuffer<Int> {
>               var buffer = buffer
>               var uninitializedBuffer = UnsafeRawPointer.allocate(capacity: capacity, of: Int.self)
>               
>               uninitializedBuffer.swapPointersAfterMoving(from: &buffer, count: count)
>               // `buffer` now points to the new allocation, filled in with the Ints.
>               // `uninitializedBuffer` now points to the old allocation, which is deinitialized.
>               
>               uninitializedBuffer.deallocate()
>               return buffer
>       }
> 
> This is *such* a strange semantic, however, that I'm not at all sure how to 
> name this function.
> 
> `moveAssign(from:count:)` could do something much simpler, returning a raw 
> version of `from`:
> 
>       target.moveAssign(from: source).deallocate()
> 
> `move()`, on the other hand, I don't see a good way to fix like this.
> 
> One ridiculous thing we could do for `moveAssign(from:count:)` and perhaps 
> `move()` is to deliberately make `self` invalid by setting it to address 0. 
> If it were `Optional`, this would nil `self`. If it weren't...well, something 
> would probably fail eventually.

For a while, I was trying to force a convention where deinitialization always 
returned a raw pointer because it's safer to initialize that raw pointer. With 
the latest proposal I'm not as concerned about that. The 
majority of the time, it will be fine to reinitialize using the typed pointer. 
If the user wants a raw pointer back after the move, it is trivial just to cast 
the typed pointer into a raw pointer.

So, while move-semantics would be cool, I really don’t think it’s necessary or 
even desired in this case. Clear doc comments should be sufficient.
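
A sketch of the buffer-growing case without any pointer swapping, assuming the shipped `moveInitialize(from:count:)` on typed pointers. The `grow` signature here is a hypothetical adaptation of the quoted example, not an API from the proposal.

```swift
func grow(_ buffer: UnsafeMutablePointer<Int>, count: Int,
          toNewCapacity capacity: Int) -> UnsafeMutablePointer<Int> {
    precondition(capacity >= count)
    let newBuffer = UnsafeMutablePointer<Int>.allocate(capacity: capacity)

    // Move the elements; afterwards the old allocation is uninitialized.
    newBuffer.moveInitialize(from: buffer, count: count)

    // If a raw pointer to the old memory is wanted, it is a plain cast.
    let oldRaw = UnsafeMutableRawPointer(buffer)
    oldRaw.deallocate()

    return newBuffer
}
```

The caller keeps using the returned typed pointer; nothing in the type system marks the old pointer as dead, which is what the doc comments would have to spell out.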

> * * *
> 
> I notice that many APIs require type parameters merely to force the user to 
> explicitly state the types involved. I wonder if we could instead introduce 
> an attribute which you could place on a parameter or return type indicating 
> that there must be an explicit `as` cast specifying its type:
> 
>       func storeRaw<T>(_: @explicit T)
>       func load<T>() -> @explicit T
>       func cast<T>() -> @explicit UnsafePointer<T>
>       
>       rawPointer.storeRaw(3 as Int)
>       rawPointer.load() as Int
>       rawPointer.cast() as UnsafePointer<Int>
> 
> This would also be useful on `unsafeBitCast`, and on user APIs which are 
> prone to type inference issues.

I'm not as irritated by explicit type arguments as some, but the feeling I get 
is that we really want a language feature that forces certain generic 
parameters to be explicit. When that happens, we'll likely phase out the 
old-style type arguments in favor of angle brackets, and I'll be sad because I 
dislike angle brackets.

> * * * 
> 
> In the long run, however, I wonder if we might end up removing 
> `UnsafeRawPointer`. If `Never` becomes a subtype-of-all-types, then 
> `UnsafePointer<Never>` would gain the basic properties of an 
> `UnsafeRawPointer`:
> 
> * Because `Never` is a subtype of all types, `UnsafePointer<Never>` could 
> alias any other pointer.
> 
> * Accessing `pointee` would be inherently invalid (it would either take or 
> return a `Never`), and APIs which initialize or set `pointee` would be 
> inherently uncallable.
> 
> * `Never` has no intrinsic size, so it could be treated as having a one-byte 
> size, allowing APIs which normally allocate, deallocate, or do pointer 
> arithmetic by instance size to automatically do so by byte size instead.
> 
> * APIs for casting an `UnsafePointer<T>` to `UnsafePointer<supertype of T>` 
> or `<subtype of T>` would do the right thing with `UnsafePointer<Never>`.
> 
> Thus, I could imagine `Unsafe[Mutable]RawPointer` becoming 
> `Unsafe[Mutable]Pointer<Never>` in the future, with some APIs being 
> generalized and moving to all `UnsafePointer`s while others are in extensions 
> on `UnsafePointer where Pointee == Never`.
> 
> It might be worth taking a look at the current API designs and thinking about 
> how they would look in that world:
> 
> * Is `nsStringPtr.casting(to: UnsafePointer<NSObject>)` how you would want to 
> write a pointee upcast? How about `UnsafePointer<NSString>(nsObjectPtr)` for 
> a pointee downcast?
> 
> * Would you want `initialize<T>(_: T.Type, with: T, count: Int = 1) -> 
> UnsafeMutablePointer<T>` in the `Never` extension, or (with a 
> supertype-of-Pointee constraint on `T`) would it be something you'd put on 
> other `UnsafeMutablePointer`s too? What does that mean for 
> `UnsafeMutablePointer.initialize(with:)`?
> 
> * Are `load` or `storeRaw` things that might make sense on any 
> `UnsafeMutablePointer` if they were constrained to supertypes only?
> 
> * Are there APIs which are basically the same on `Unsafe[Mutable]Pointer`s 
> and their `Raw` equivalents, except that the `Raw` versions are "dumb" 
> because they don't know what type they're operating on? If so, should they be 
> given the same name?

Very early on I considered a special `Never` element type for all of the 
excellent reasons that you laid out (nice job explaining that), but the pointer 
conversion rules that we want are not implementable.

Since then, the proposal has evolved so much that it makes sense to have a 
nominal raw pointer type. The pointer type itself is distinctly different, not 
just the element type, and the type system needs to be aware of that. It's also 
critical that the raw and typed pointers have a distinct API. Moving both of 
their functionality into extensions would just be a workaround. In reality, 
since the semantics are different, there's almost no shared implementation.

In short, raw pointers are deliberately a different type and we want 
developers and APIs to be cognizant of that.
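
As a rough sketch of what that distinction looks like at a use site, assuming the shipped standard-library API (`assumingMemoryBound(to:)`); `sumThroughRaw` is a hypothetical helper:

```swift
func sumThroughRaw(_ values: [Int32]) -> Int32 {
    let typed = UnsafeMutablePointer<Int32>.allocate(capacity: values.count)
    defer { typed.deallocate() }
    typed.initialize(from: values, count: values.count)

    // Typed to raw: always valid, but the result is a different nominal type.
    let raw = UnsafeMutableRawPointer(typed)

    // Raw to typed: the caller must name the type the memory is bound to.
    let back = raw.assumingMemoryBound(to: Int32.self)

    var total: Int32 = 0
    for i in 0..<values.count { total += back[i] }
    return total
}
```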

-Andy
