Howdy,

It’s good to have a formal, language-supported way to manage these aspects of 
concurrency.

In particular, I like that it establishes the asynchronously provided values 
in the surrounding scope, the way guard does.  This helps me write a clean 
sequence of calls in which a value from one call stays available past 
intermediate calls that don’t use it.  Often I need to make a few 
asynchronous calls which fetch disparate value types from disparate endpoints 
or frameworks and then provide them all together to some final API.
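For instance, something like this sketch, assuming the proposed async/await 
syntax (the endpoint functions and the Profile type here are hypothetical, 
just to show the shape):

    struct User {}
    struct Settings {}
    struct Avatar {}
    struct Profile { let user: User; let settings: Settings; let avatar: Avatar }

    func fetchUser() async throws -> User { User() }
    func fetchSettings() async throws -> Settings { Settings() }
    func fetchAvatar(for user: User) async throws -> Avatar { Avatar() }

    func loadProfile() async throws -> Profile {
        let user = try await fetchUser()               // enters the surrounding scope
        let settings = try await fetchSettings()       // a call that doesn’t use user
        let avatar = try await fetchAvatar(for: user)  // user is still in scope here
        return Profile(user: user, settings: settings, avatar: avatar)
    }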

I also like that the actor model formally wraps up access to shared 
resources.  It seems that, using these together, the compiler would be able 
to do some fundamental reasoning about whether deadlocks could occur (or 
prevent them entirely without expensive run-time mechanisms) and also 
determine whether certain code paths were forgotten.  One question I have is 
“how do we assign queues or priorities to the async methods?”  I had been 
toying with the idea of declaring, at an async method’s definition, a 
reference to the queue it runs on, but this actor model instead groups around 
the data which needs coordination, which is possibly better.
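To show what I mean by grouping around the data, a rough sketch in something 
like the proposed actor syntax (the type and its state are mine, purely 
illustrative):

    actor DownloadLedger {
        private var bytesReceived: [String: Int] = [:]   // the shared state

        func record(_ count: Int, for host: String) {    // outside callers reach
            bytesReceived[host, default: 0] += count     // this asynchronously, so
        }                                                // access is serialized

        func total() -> Int {
            bytesReceived.values.reduce(0, +)
        }
    }

Note there’s no queue named anywhere; the coordination lives with the data.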

I haven’t yet read the philosophy links, so if I’m repeating ideas, or these 
ideas are moot in light of something, I guess ignore me, and I’ll just feel 
slightly embarrassed later.

However, at the UI level we often don’t even use the GCD methods, because 
they do not directly support cancellation and progress reporting, so we use 
NSOperation and NSOperationQueue instead.  To me, adding a language feature 
which gets ignored whenever I build a full-featured app won’t be worth the 
time.  Whatever is done here, we need to be cognizant that someone will 
implement another layer on top which does provide progress reporting and 
cancellation, and make sure there’s a clean point of entry, much like the 
Integer protocols didn’t themselves implement arbitrary-sized integers but 
gave various implementations a common interface.
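By a clean point of entry, I mean leaving room for something even as small as 
a protocol that the higher layer can hang those features on.  The names here 
are hypothetical, just to show the shape:

    protocol CancellableProgressReporting {
        var isCancelled: Bool { get }
        var fractionCompleted: Double { get }   // echoes NSProgress’s naming
        func cancel()
    }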

I know that data parallelism is out of scope, but GPUs have been mentioned.  
When I write code which processes large amounts of data (whether that’s 
scientific data analysis or exporting audio or video tracks), it invariably 
uses many different hardware resources: files, GPUs, and others.  The biggest 
unsolved system-level problem I see (besides the inter-process APIs 
mentioned) is a way to effectively “pull” data through the system, instead of 
the current “push”-oriented APIs.  With push, we send tasks to queues.  
Perhaps a resource, like reading data from 1000 files, is slower than the 
later stage, like using the GPU to perform optimal feature detection in each 
file, so my code runs fine when executed.  But later I add a slightly slower 
GPU task, and now my files fill up memory faster than my GPUs drain them; 
instead of everything running fine, my app exhausts memory and the entire 
process crashes.  Sure, I can create a semaphore to read only the “next” file 
into memory at a time, but I suppose that’s my point: instead of getting to 
focus on my task of analyzing several GB of data, I’m spending time building 
a pull-oriented asynchronous architecture.  I don’t know whether formal 
“pull” could be in scope for this next phase, but let’s learn from the 
problem of the deprecated “current queue” function in GCD, which made it 
fundamentally impossible to write run-time-safe dispatch_sync calls, and 
provide at least hooks into the system-detected available compute resources.  
(If get_current_queue had provided the stack of queues, it would have been 
usable.)
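For reference, the manual semaphore workaround looks something like this 
today; fileURLs and process(_:) are placeholders for the real pipeline stages:

    import Foundation

    let fileURLs: [URL] = []                        // placeholder input list
    func process(_ data: Data) { /* GPU stage */ }  // placeholder slow consumer

    let inFlight = DispatchSemaphore(value: 4)      // at most 4 files in memory
    let gpuQueue = DispatchQueue(label: "gpu-stage")

    for url in fileURLs {
        inFlight.wait()                             // producer blocks: a crude “pull”
        guard let data = try? Data(contentsOf: url) else { inFlight.signal(); continue }
        gpuQueue.async {
            process(data)                           // the slower stage
            inFlight.signal()                       // admit the next file
        }
    }

That’s a dozen lines of plumbing that have nothing to do with the analysis 
itself.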

Together with the preceding topic is the idea of cancellation of long-running 
processes.  Maybe that’s because the user needs to board the train and 
doesn’t want to keep draining power while exporting a large video document 
they could export later.  Or maybe it’s because the processing for this frame 
of the live stream is taking too long and the frame is going to get dropped.  
But here we are again, expressing dependencies and cancellation, like the 
high-level frameworks.

I’m glad someone brought up NSURLSessionDataTask, because it provides a more 
realistic window into the kinds of features we’re currently using.  If we don’t 
design a system which improves this use, then I question whether we’ve made 
real progress.

NSBlockOperation was all but useless to me: since it didn’t provide a 
reference to ‘self’ when its block got called, I had to write a subclass 
which did, so the block could ask its operation whether it had been cancelled 
and thereby implement cancellation of long-running tasks.  NSOperation also 
lacks a clean way to pass data from one block to those dependent on it, so 
there are lots of subclasses to write to handle that in a generic way.  I 
feel like the block-based undo methods understood this when they added 
separate references to weakly-held targets.  That enabled me to use the 
framework methods out of the box.
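The subclass is tiny, something like this (a minimal sketch, written against 
the Swift-renamed Operation; the loop body is illustrative):

    import Foundation

    final class SelfAwareOperation: Operation {
        private let work: (Operation) -> Void
        init(work: @escaping (Operation) -> Void) {
            self.work = work
            super.init()
        }
        override func main() {
            work(self)   // the block can now poll isCancelled on its operation
        }
    }

    let op = SelfAwareOperation { op in
        for chunk in 0..<1_000_000 where !op.isCancelled {
            _ = chunk    // long-running work goes here
        }
    }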

So my point is not that these problems need to be solved at the language 
level, but that we should make sure we aren’t spending a large amount of time 
designing and implementing a glorious system which winds up used only inside 
some wrapper functions, like UnsafeMutableRawPointer (is that the right name? 
I can never remember), and then becomes useless once anyone starts building 
any real-world app, UI or server-side.  UnsafeMutableRawPointer has its 
place, but the vast majority of app developers will likely never use it, and 
thus it doesn’t deserve years of refining.


-Ben Spratling

