On 27/12/2018 11:56, H. Nikolaus Schaller wrote:
Hi David,
Am 27.12.2018 um 12:15 schrieb David Chisnall <gnus...@theravensnest.org>:
On 26/12/2018 16:08, Patryk Laurent wrote:
Hi David,
a language (which is somewhat dated even in its latest incarnation).
I would love to know your thoughts with respect to that point, if you'd care to
share (off list if you'd prefer). Or might you have a talk/article you could
point me to?
A programming language is intended as a compromise between the abstract
algorithms in the programmer's head and the concrete hardware on which it runs.
Ideally, it should be easy to map in both directions: from the human to the
programming language, and from the programming language to the target hardware.
Roughly speaking, a high-level language is one that optimises for the
human-to-programming-language translation; a low-level language is one that
optimises for the programming-language-to-machine translation.
Objective-C is a good late '80s programming language. It has an abstract
machine that is very close to late '80s hardware: flat memory, a single
processor.
Really? First of all, even the good old 680x0 did have a coprocessor concept
and multitasking support. AFAIR it was even multiprocessor-capable, but I am
not sure whether any machines were built with several processors.
The hardware supported parallelism, that doesn't mean that the language did.
And ObjC has NSThread and there are Distributed Objects which are very old but
powerful concepts. If DO is used properly you have a single set of
communicating objects which can be spread over multiple processors, machines,
even distant internet nodes...
Unfortunately Apple almost abandoned the concept and did not take care much
about binary compatibility and an open protocol specification (GNUstep DO are
not compatible to OS X DO). Therefore it is rarely used.
[P]DO was a good attempt, but people have mostly learned from its
mistakes. DO has synchronous and asynchronous messaging. Synchronous
messaging works reasonably well, but reentrancy with asynchronous
messaging is really difficult to implement without introducing either
deadlocks or data races.
NSThread lets you have multiple threads but the language does absolutely
nothing to make them easy to use, with the possible exception of
@synchronized, which is a stupid idea added for Java compatibility.
There's nothing in Objective-C, for example, for associating objects
with a particular thread to ensure that they aren't accidentally
aliased. Or for guaranteeing that an object graph is deeply immutable
and therefore safe to share between threads.
This isn't the world that we're currently living in. Computers have
multiple, heterogeneous processors. My ancient phone (Moto G, first
generation) has four ARM cores, some GPU cores, and a bunch of specialised
accelerators. It has a fairly simple memory hierarchy, but my laptop and
desktop both have 3 layers of caches between the CPU cores and main memory and
have private DRAM attached to the GPU.
A modern language has to expose an abstract machine that's similar to this.
Hm. What do you mean by this? IMHO a modern (high-level) language should not
expose but rather hide all this heterogeneity and the three layers of cache,
if they exist in the physical HW.
Usually you don't program for four ARM cores and a specific number of GPU cores
and accelerators. You rather expect that your high-level (application) program
(game, physics simulator, CAD, word processor, web browser, video editor, ...)
is compiled in a way that it magically makes best use of what is available.
Of course, ObjC is not ideal for implementing system level libraries or even a
kernel.
If you want to take advantage of homogeneous multicore, then you need a
language that lets you describe the parallelism inherent in your
workload and will then combine the parallel parts for you. Objective-C makes it
very hard to do any more than have worker threads in the background that
don't share any state with the rest of the program. Libdispatch lets
you put together asynchronous pipelines, but nontrivial ones need very
careful design to make sure that they don't accidentally mutate the same
objects at multiple stages. Again, there's nothing in the type system
that describes sharing.
A good language also has to remove mechanical work from the programmer.
Objective-C does some nice things with reflection here: for example, you need a
lot less boilerplate in Objective-C to wire up a GUI to its controller than in
Java.
Objective-C more or less gives you memory safety if you avoid the C subset of
the language (though that's pretty hard). Unfortunately, even if you avoid C
in your own code, all non-trivial Objective-C programs link to a load of
complex (and, therefore, buggy) C libraries. They have no protection from these
libraries: a single pointer bug in the C code can violate all of the invariants
that the Objective-C runtime depends on (as you can see from a lot of previous
posts in this list).
Modern Objective-C, with ARC, at least gives you temporal memory safety, though
it also gives you memory leaks if you have cyclic data structures and don't
explicitly break memory cycles. Classes such as NSArray give you spatial
memory safety if you use them instead of C arrays (and don't call methods like
-data). With Objective-C++, you can use lower-overhead things like std::string
and std::vector for primitive types and get memory safety if you use .at()
instead of operator[], but it's somewhat clunky (memory safety is possible,
but it isn't the easiest option).
C++ has evolved a lot in the last 7 years. With std::shared_ptr and std::unique_ptr,
you get the same level of memory safety as ARC, with similar overheads. ARC
integrates nicely with Objective-C++, so you can put Objective-C object pointers into
C++ structs safely (including, for example, having a std::vector<id>).
Objective-C and C++ have very different strengths: Objective-C provides high-level
abstractions for late binding, C++ provides tight coupling for low-level compile-time
specialised data structures.
If you have to write Objective-C now, I'd recommend Objective-C++ with ARC as
the default base language. It's no surprise that this was Microsoft's choice
for WinObjC and apparently Apple also uses Objective-C++ extensively in their
own frameworks. GNUstep is somewhat crippled by using neither ARC nor
Objective-C++ internally. Both significantly improve developer productivity.
The three big challenges in language design for modern requirements are:
- Concurrency (including heterogeneous multiprocessing)
- Error handling
- Safe isolation (sandboxing / compartmentalisation)
Be suspicious of any 'new' language that doesn't have a good story for all of
these. If you can't express the idea of a graph of objects that the GPU now
has exclusive access to, then your language isn't suitable for modern hardware.
Why should the programmer (or a modern high-level language) have to care that a thing called
"GPU" exists besides a "CPU" in modern hardware? If they have to, there seems
to be something wrong with the abstraction level.
They don't have to care that a GPU exists, but if they *want* to run
code on a GPU then they shouldn't have to fight their language to do so.
A GPU has very different characteristics to a CPU: it is optimised for
streaming data processing that has little or no data-dependent flow
control and is highly parallel, whereas the CPU is optimised for
workloads that have little parallelism, high locality of reference and
frequent branches. They have separate memory. Compiling code that is
structured for a GPU to run on a GPU is trivial, but the data management
is much harder. If your language can't express the idea of an object
graph that is being handed off to another part of the program for
exclusive use, then it also can't express data sharing with decoupled
accelerators.
David
_______________________________________________
Discuss-gnustep mailing list
Discuss-gnustep@gnu.org
https://lists.gnu.org/mailman/listinfo/discuss-gnustep