On Sunday, 21 October 2018 at 05:47:14 UTC, Manu wrote:
On Sat, Oct 20, 2018 at 10:10 AM Stanislav Blinov via
Digitalmars-d <digitalmars-d@puremagic.com> wrote:
Synchronized with what? You still have `a`, which isn't
`shared` and doesn't require any atomic access or
synchronization. At this point it doesn't matter if it's an
int or a struct. As soon as you share `a`, you can't just
pretend that reading or writing `a` is safe.
`b` can't read or write `a`... accessing `a` is absolutely safe.
It's not, with or without your proposal. The purpose of sharing
`a` into `b` is to allow someone to access `*a` in a threadsafe
way (but un-@safe, as it *will* require casting away `shared`
from `b`). That is what makes keeping an unshared reference
`a` un-@safe: whoever accesses `*a` in their @trusted
implementations via `*b` can't know that `*a` is being
(@safe-ly!) accessed in a non-threadsafe way at the same time.
Someone must do something unsafe to undermine your
threadsafety... and if you write unsafe code and don't know what
you're doing, there's nothing that can help you.
Ergo, it follows that anyone making an implicit cast from
mutable to shared had better know what they're doing, which mere
mortal users (not "experts") might not. I.e. it's a way of giving
a loaded gun to someone who has never held a weapon before.
Today, every interaction with shared is unsafe.
Nod.
Creating a safe interaction with shared will lead to people not
doing unsafe things at every step.
Triple nod.
Encapsulate it all you want, safety only remains a
contract of convention, the language can't enforce it.
You're talking about @trusted code again. You're fixated on
unsafe interactions... my proposal is about SAFE interactions.
I'm trying to obliterate unsafe interactions with shared.
I know... Manu, I *know* what you're trying to do. We (me, Atila,
Timon, Walter...) are not opposing your goals, we're pointing out
the weakest spot of your proposal, which, it would seem, would
require more changes to the language than just disallowing
reading/writing `shared` members.
module expertcode;

@safe:

struct FileHandle {
    @safe:
    void[] read(void[] storage) shared;
    void[] write(const(void)[] buffer) shared;
}

FileHandle openFile(string path);
// only the owner can close
void closeFile(ref FileHandle);

// i.e. generate a number of jobs in some queue
void shareWithThreads(shared FileHandle*);
// waits until all processing is done
void waitForThreads();

module usercode;

import expertcode;

void processHugeFile(string path) {
    FileHandle file = openFile(path);
    shareWithThreads(&file); // implicit cast
    waitForThreads();
    file.closeFile();
}
This is a very strange program...
Why? That's literally the purpose of being able to `share`: you
create/acquire a resource, share it, but keep a non-`shared`
reference for yourself. If that's not required, you'd just create
the data `shared` to begin with.
I'm dubious it is in fact "expertcode"... but let's look into
it.
You're fixating on it being a file now. I give an abstract example,
you dismiss it as contrived, I give a concrete one, you want to
dismiss it as "strange".
Heh, replace 'FileHandle' with 'BackBuffer', 'openFile' with
'acquireBackBuffer', 'shareWithThreads' with
'generateDrawCommands', 'waitForThreads' with
'gatherCommandsAndDraw', 'closeFile' with 'postProcessAndPresent'
;)
File handle seems to have just 2 methods... and they are both
threadsafe. Open and Close are free-functions.
It doesn't matter if they're free functions or not. What matters
is the signature: they take a non-`shared` (i.e. 'owned')
reference. Methods are free functions in disguise.
Close does not promise threadsafety itself (but of course, it
doesn't violate read/write's promise, or the program is
invalid).
Yep, and that's the issue. It SHALL NOT violate threadsafety, but
it can't promise such in any way :(
I expect the only possible way to achieve this is by an
internal mutex to make sure read/write/close calls are
serialised.
With that particular interface, yes.
read and write will appropriately check their file-open state
each time they perform their actions.
Why? The only purpose of giving someone a `shared` reference is
to give a reference to an open file. `shared` references can't do
anything with the file but read and write, they would expect to
be able to do so.
What read/write do in the case of being called on a closed
file... anyone's guess? I'm gonna say they no-op... they
return a null pointer to indicate the error state.
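For illustration only, the design described above (an internal mutex serialising read and close, with read reporting a closed file by returning null) might be sketched roughly like this. It is not code from either party in the thread: the fields `mtx` and `isOpen`, the function bodies, and the file name are all assumptions, write() is omitted for brevity, and the cast to `shared` in `main` is spelled out because current D has no implicit conversion.

```d
import core.sync.mutex : Mutex;

// Hypothetical sketch, not an implementation taken from the thread.
struct FileHandle
{
    private Mutex mtx;    // assumed: serialises read() and closeFile()
    private bool isOpen;  // assumed: open-state flag checked on every call

    // Callable through a `shared` reference. Returns null once closed.
    void[] read(void[] storage) shared
    {
        // Casting away `shared` is the unsafe step that a real
        // implementation would have to hide behind @trusted.
        auto self = cast(FileHandle*) &this;
        self.mtx.lock();
        scope (exit) self.mtx.unlock();
        if (!self.isOpen)
            return null;
        return storage; // stand-in for an actual read
    }
}

FileHandle openFile(string path)
{
    FileHandle h;
    h.mtx = new Mutex;
    h.isOpen = true; // `path` is unused in this sketch
    return h;
}

// Takes the unshared ('owned') reference: only the owner can close.
void closeFile(ref FileHandle h)
{
    h.mtx.lock();
    scope (exit) h.mtx.unlock();
    h.isOpen = false;
}

void main()
{
    auto file = openFile("data.bin");
    auto sharedRef = cast(shared(FileHandle)*) &file; // explicit cast today
    auto buf = new ubyte[](16);
    assert(sharedRef.read(buf) !is null); // open: read succeeds
    file.closeFile();
    assert(sharedRef.read(buf) is null);  // closed: read reports failure
}
```

Note that the per-call `isOpen` check is exactly the extra work being argued about: it keeps the handle memory-safe after an early close, but says nothing about whether the program's logic is right.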
Looking at the meat of the program; you open a file, and
distribute it to do accesses (I presume?)....
Naturally, this is a really weird thing to do, because even if
the API is threadsafe such that it doesn't crash and
reads/writes are serialised, the sequencing of reads/writes will
be random, so I don't believe any sane person (let alone an
expert) would write this program... but moving on.
Um, that's literally what std.stdio does, for writes at least,
except it doesn't advertise `File` as `shared`. That's how we get
interleaved, but not corrupted, output even when writing from
multiple threads. Now, that's not *universally* useful, but
nonetheless that's a valid use case.
Then you wait for them to finish, and close the file.
Fine. You have a file with randomly interleaved data... for
whatever reason.
Or I have command lists, or images loaded in background...
This program does appear to be safe (assuming that the
implementations aren't invalid), but a very strange program
nonetheless.
Remove the call to `waitForThreads()` (assume user just forgot
that, i.e. the "accident"). Nothing would change for the
compiler: all calls remain @safe.
Yup.
And yet, if we're lucky, we get
a consistent instacrash. If we're unlucky, we get memory
corruption, or an unsolicited write to another currently open
file, either of which can go unnoticed for some time.
Woah! Now this is way off-piste..
Why would you get a crash? Why would you get memory corruption?
None of those things make sense.
Because the whole reason to have `shared` is to avoid the
extraneous checks that you mentioned above, and only write actual
useful code (i.e. lock-write-unlock, or read-put-to-queue-repeat,
or whatever), not busy-work (testing if the file is open on every
call). If you have a `shared` reference, it better be to existing
data. If it isn't, the program is invalid already: you've shared
something that doesn't "exist" (good for marketing, not so good
for multithreading). That's why having `shared` and un-`shared`
references to the same data simultaneously is not safe: you can't
guarantee in any way that the owning thread doesn't invalidate
the data through its non-`shared` reference while you're doing
your threadsafe `shared` work; you can only "promise" that by
convention (documentation).
So, you call closeFile immediately and read/write start
returning null.
And I have partially-read or partially-written data. Or maybe I
call closeFile(), the main thread continues and opens another
file, which gets the same file descriptor, and the `shared`
references to FileHandle that the user forgot to wait on keep
working, oblivious to the fact that it's a different file now.
It's a horrible, but still @safe, implementation of FileHandle,
yes, but the caller (user) doesn't know that, and can't know that
just from the interface. The only advice against that is "don't
do that", but that's irrespective of your proposal.
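The descriptor-reuse scenario above can be reproduced in a few lines. This is a minimal, POSIX-only sketch and makes several assumptions: the function name is made up, /dev/null and /dev/zero stand in for the two files, and the process is single-threaded with no other descriptor activity in between, so the OS hands the just-freed number to the next open().

```d
import core.sys.posix.fcntl : open, O_RDONLY;
import core.sys.posix.unistd : close;

// Hypothetical helper: returns true when a descriptor number kept past
// close() ends up naming a different, newly opened file.
bool staleDescriptorIsReused()
{
    immutable first = open("/dev/null", O_RDONLY);
    assert(first >= 0);
    immutable stale = first; // the number a forgotten worker still holds
    close(first);            // owner closes without waiting for workers

    // Owner moves on and opens an unrelated file.
    immutable second = open("/dev/zero", O_RDONLY);
    assert(second >= 0);
    scope (exit) close(second);

    // POSIX open() returns the lowest-numbered free descriptor, so the
    // stale number now refers to the second file.
    return stale == second;
}

void main()
{
    import std.stdio : writeln;
    writeln("descriptor reused: ", staleDescriptorIsReused());
}
```

Nothing in the types records that `stale` outlived the file it was opened for, which is why the only available defence is convention.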
I'm going to assume that `shareWithThreads()` was implemented by
an 'expert' who checked the function results for errors. It was
detected that the reads/writes failed, an error "failed to read
file" was emitted, and then the function returned promptly.
The only uncertainty about what happens in this program is how
`shareWithThreads()` handles read/write emitting an error.
But you can only find out about these errors in `waitForThreads`,
the very call that the user "forgot" to make!
Of course the program becomes invalid if you do that, there's
no question about it, this goes for all buggy code.
In this case, I wouldn't say the program becomes 'invalid'; it
is valid for filesystem functions to return error states and you
should handle them.
In this case, read/write must return some "file not open"
state, and it should be handled properly.
This problem has nothing to do with threadsafety. It's a logic
issue related to threading, but that's got nothing to do with
this.
There's no question about it, it *is* a logic error. The point
is, it's a logic error that ultimately can lead to UB despite
being @safe. Just like this is:
https://issues.dlang.org/show_bug.cgi?id=19316.
The problem is, the definition of "valid" lies beyond the type
system: it's an agreement between different parts of code, i.e.
between expert programmers who wrote FileHandle et al., and users
who write processHugeFile(). The main issue is that certain
*runtime* conditions can still violate @safe-ty.
Perhaps you don't understand what @safe-ty means? It's a
compiler assertion that the code is memory-safe. It's not a
magic attribute that tells you that your program is right.
I know.
Runtime conditions being in a valid state is a high-level
problem for the program, and doesn't interact with threadsafety
in any fundamental way, nor in any way that @safe has anything to
do with.
Yep.
You're just describing normal high-level multi-threading logic
problems. `shared` does not and can not help you with that; you
need to look to libraries that offer threading support
frameworks for that.
It can help you not write code that does invalid access to
memory and crash. That's the extent of its charter.
I understand that. So... it would seem that your proposal focuses
more on @safe than on threadsafety?
If a `shared` API is designed well, it can also offer strong
implicit advice about how to correctly interact with APIs. The
compiler will coerce you to do the right things with error
messages.
Your proposal makes the language more strict wrt. writing
@safe 'expertmodule', thanks to disallowing reads and writes
through `shared`, which is great.
However the implicit conversion to `shared` doesn't in any way
improve the situation as far as user code is concerned, unless
I'm still missing something.
It does, it eliminates unsafe user interactions. It must be
that way to be safe. There were no casts above, it's great! And
your program is safe!
(although it's wrong)
It's @safe, but it's wrong because it's not threadsafe. Yay! :D
FWIW, I doubt anybody in their right mind would attempt to
write a threadsafe filesystem API this way.
std.stdio ;) (yes, I know there's no `shared` there, but that's
what it does).
Any such API would be structured COMPLETELY differently; it
would likely have one `shared` method that would accept
requests for deferred fulfillment, and handle unique objects
associated with each request.
Perhaps. How would the user know that?