Re: Gc collects trigger access violation exception
On Saturday, 11 February 2017 at 03:21:29 UTC, mashomee wrote: Hi, everyone. I have a thread running in a DLL. But it always halts because of an access violation exception when GC begins collecting. [...] It's great that you've found a workaround. If you weren't doing something funky to trigger this, please file a bug report at http://issues.dlang.org
Re: usage of ref foreach with variadic functions fails with "cannot be ref"
On Sunday, 12 February 2017 at 03:34:19 UTC, Michael Coulombe wrote: Do you have a complete code example that gives your error? I can't reproduce it (DMD v2.073.0):

int foo(T...)(T vars) {
    int i = 0;
    foreach (ref v; vars) {
        v = 5;
        i += v;
    }
    return i;
}

void bar(T...)(ref T vars) {
    foreach (ref v; vars) {
        v = 3;
    }
}

void main() {
    import std.stdio;
    int x = 7;
    bar(x);
    writeln(foo(4, x, 8.2)); // 15, no errors
}

Oh... That's really strange; I've updated my code to use non-indexed foreach, and it's now building without the error. I have absolutely no idea what caused the error to occur. I've also tried reverting to earlier versions of the project, and I can't recreate the issue. If it occurs again, I'll snapshot the project and try to make a repeatable setup.
Re: Mallocator and 'shared'
On Saturday, 11 February 2017 at 04:32:37 UTC, Michael Coulombe wrote: On Friday, 10 February 2017 at 23:57:18 UTC, bitwise wrote: [...] A shared method means that it can only be called on a shared instance of the struct/class, which will have shared fields. A shared method should be logically thread-safe, but that cannot be guaranteed by the compiler. A non-shared method can touch shared memory, and thus should be thread-safe if it does, but can only be called on a non-shared instance with possibly non-shared fields. shared/non-shared methods don't mix because you generally need to use different, less efficient instructions and algorithms to be thread-safe and scalable in a shared method. In the case of Mallocator, there are no fields, so as far as I can tell the attribute doesn't do much except for documentation and for storing references to it in other shared structs/objects. Thanks for the explanation, but I'm still confused. It seems like you're saying that 'shared' should mean both 'thread safe' and 'not thread safe' depending on context, which doesn't make sense. Example:

shared A a;

struct A {
    int x, y;
    void foo() shared {
        a.x = 1;
    }
}

int main(string[] argv) {
    a.x = 5;
    a.y = 5;
    a.foo();
    return 0;
}

Qualifying 'a' with 'shared' means that it's shared between threads, which means that accessing it is _not_ thread safe. Since the method 'foo' accesses 'a', 'foo' is also _not_ thread safe. Given that both the data and the method are 'shared', a caller should know that race conditions are possible and that they should acquire a lock before accessing either of them... or so it would seem. I imagine that qualifying a method with 'shared' should mean that it can access shared data, and hence, is _not_ thread safe. This would prevent access to 'shared' data from any non-'shared' context, without some kind of bridge/cast that a programmer would use when they knew that they had acquired the lock or ownership of the data.
Although this is what would make sense to me, it doesn't seem to match the current implementation of 'shared', or what you're saying. It seems that methods qualified with 'shared' may be what you're suggesting matches up with the 'bridge' I'm trying to describe, but again, using the word 'shared' to mean both 'thread safe' and 'not thread safe' doesn't make sense. Firstly, because the same keyword should not mean two strictly opposite things. Also, if a 'shared' method is supposed to be thread safe, then the fact that it has access to shared data is irrelevant to the caller. So 'shared' as a method qualifier doesn't really make sense. What would make more sense is to have a region where 'shared' data could be accessed - maybe something like this:

struct S {
    shared int x;
    Lock lk;

    private void addNum(int n) shared {
        x += n;
    }

    int add(int a, int b) {
        shared {
            lk.lock();
            addNum(a);
            addNum(b);
            lk.unlock();
        }
        return x;
    }
}

So above, 1) 'x' would be shared, and mutating it would not be thread safe. 2) 'addNum' would have access to 'shared' data, and also be non-thread-safe. 3) 'x' and 'addNum' would be inaccessible from 'add' since they're 'shared'. 4) a 'shared' block inside 'add' would allow access to 'x' or 'addNum', with the responsibility being on the programmer to lock. 5) alternatively, 'shared' data could be accessed from within a 'synchronized' block. I thought 'shared' was a finished feature, but it's starting to seem like it's a WIP. This kind of feature seems like it has great potential, but is mostly useless in its current state. After more testing with shared, it seems that 'shared' data is mutable from many contexts, from which it would be unsafe to mutate it without locking first, which basically removes any guarantee that would make 'shared' useful. Again, tell me if I'm wrong here, but there seem to be a lot of holes in 'shared'. Thanks
Re: Mallocator and 'shared'
On Sunday, 12 February 2017 at 20:08:05 UTC, bitwise wrote: It seems that methods qualified with 'shared' may be what you're suggesting matches up with the 'bridge' I'm trying to describe, but again, using the word 'shared' to mean both 'thread safe' and 'not thread safe' doesn't make sense. [...] For essentially all that follows, refer to [1][2]. `shared` (as well as `__gshared`) on a variable has the semantics of multiple threads sharing one single memory location for that variable (i.e. it will not be put into thread-local storage). Accessing such data directly is inherently not thread safe. Period. You will need some form of synchronization (see [3]) to access such data in a thread-safe manner. Now, `shared` is supposed to additionally provide memory barriers, so that reads/writes on such variables are guaranteed not to be reordered in a way that breaks your algorithm; remember, the compiler (and also, later, the CPU when it executes the instructions) is allowed to reorder reads/writes to a memory location to be more efficient, as long as doing so won't change the logic as the compiler or CPU sees it. Example:

__gshared int a = 0;

// thread 1:
a = 1;
int b = a;
writeln(b);

// thread 2:
a += 1;

In the above, you may expect `b` to be either 1 or 2, depending on how the CPU interleaves the memory accesses, but it can, in fact, also be 0, since neither the compiler nor the CPU can detect any reason why `a = 1` should need to come before `int b = a`, and may thus reorder the write and the read. Memory barriers prevent such reordering in the CPU, and if we had made `a` `shared`, the compiler would have been supposed to emit those barriers (in addition to not reordering the accesses itself). Unfortunately, that emission is not implemented. From [4]: Non-static member functions can have, in addition to the usual FunctionAttributes, the attributes const, immutable, shared, or inout. These attributes apply to the hidden this parameter.
Thus a member function being `shared` means nothing more than that the instance it is called on must also be `shared`, i.e.

class Foo {
    shared void bar();
}

Foo foo;
foo.bar(); // illegal: `foo` (the hidden `this` of `bar`) is not shared

shared Foo foobar;
foobar.bar(); // legal: `foobar` is shared

That's it; there are no two meanings of `shared` depending on some context, there is only one: the data in question, which is either the attributed variable or the object/instance of the member function being attributed, is shared between threads, and accessing it directly is not thread safe. On Sunday, 12 February 2017 at 20:08:05 UTC, bitwise wrote: I thought 'shared' was a finished feature, but it's starting to seem like it's a WIP. I prefer the term "unfinished", since "WIP" implies that it's being worked on. AFAIK no one is currently working on implementing what's missing in the compiler frontend with regards to the spec. On Sunday, 12 February 2017 at 20:08:05 UTC, bitwise wrote: This kind of feature seems like it has great potential, but is mostly useless in its current state. I share that opinion and generally either use `__gshared` if I absolutely have to share data via shared memory (designing carefully to avoid all the potential issues), or - which I much prefer - use message passing: `std.concurrency` is your friend. On Sunday, 12 February 2017 at 20:08:05 UTC, bitwise wrote: After more testing with shared, it seems that 'shared' data is mutable from many contexts, from which it would be unsafe to mutate it without locking first, which basically removes any guarantee that would make 'shared' useful. As pointed out above, that's to be expected, since that's its job. Regarding guarantees: since D treats data as thread-local by default, you need either `shared` or `__gshared` to have mutable shared (intra-process) memory (ignoring OS facilities for inter-process shared memory).
The main advantage is not in data being `shared`/`__gshared`, but in the guarantees that all the other (unattributed, thread-local) data gets: each thread has its own copies, and any optimisations applied to code that accesses them need not consider multiple threads (I'd wager this is a significant boon towards D's fast compile times). As for the benefits of `shared` over `__gshared`: if the spec were properly implemented, the useful properties would include not needing to worry about memory barriers yourself. Other useful guarantees are the more rigorous type checks, when compared to `__gshared`, which are supposed to prevent you from committing some of the common mistakes occurring in non-sequential programming (see, e.g., the code example with `class Foo` above). On Sunday, 12 February 2017 at 20:08:05 UTC, bitwise wrote: Again, tell me if I'm wrong here, but there seems to be a lot of holes in 'shared'. There are holes in the implementation of `shared`; its spec, however, is not where the holes lie.
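The message-passing alternative recommended above can be sketched with `std.concurrency`; this is a minimal example (the worker and its doubling logic are invented purely for illustration):

```d
import std.concurrency : spawn, send, receiveOnly, ownerTid;

// Hypothetical worker: receives an int, doubles it, sends it back
// to the thread that spawned it. No shared mutable state involved.
void worker()
{
    auto n = receiveOnly!int();
    send(ownerTid, n * 2);
}

// Spawn a worker, send it a value, and wait for the reply.
int roundTrip(int value)
{
    auto tid = spawn(&worker);
    send(tid, value);
    return receiveOnly!int();
}

void main()
{
    import std.stdio : writeln;
    writeln(roundTrip(21)); // prints 42
}
```

Because messages are copied (or must be immutable/shared), the type system keeps each thread's data thread-local, which is exactly the guarantee described above.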
Re: Mallocator and 'shared'
On Sunday, 12 February 2017 at 20:08:05 UTC, bitwise wrote: It seems like you're saying that 'shared' should mean both 'thread safe' and 'not thread safe' depending on context, which doesn't make sense. Makes sense to me: a `shared` variable is shared among threads. Accesses are not thread-safe. When a function has a `shared` parameter, it's expected to access that variable in a thread-safe manner. A `shared` method is a function with a `shared` `this` parameter. Consider the alternative, that functions are not expected to ensure thread-safety on their `shared` parameters: that would mean you have to ensure it at the call site. And then what would be the point of marking the parameter `shared`, if thread-safety is already ensured? Example:

shared A a;

struct A {
    int x, y;
    void foo() shared {
        a.x = 1;
    }
}

int main(string[] argv) {
    a.x = 5;
    a.y = 5;
    a.foo();
    return 0;
}

Qualifying 'a' with 'shared' means that it's shared between threads, which means that accessing it is _not_ thread safe. Yup. In my opinion, non-atomic accesses like that should be rejected by the compiler. But you can write race conditions with only atomic stores/loads, so in the end it's the programmer's responsibility to write correct code. Since the method 'foo' accesses 'a', 'foo' is also _not_ thread safe. Well, yes, that `foo` isn't thread-safe. But it should be written differently so that it is. Given that both the data and the method are 'shared', a caller should know that race conditions are possible and that they should acquire a lock before accessing either of them... or so it would seem. But when you have the lock, you can safely call any method, including non-`shared` ones. I see no point in distinguishing `shared` and unshared methods then. Non-`shared` methods are obviously not safe to call on `shared` objects. So `shared` methods must be the other thing: safe. I imagine that qualifying a method with 'shared' should mean that it can access shared data, and hence, is _not_ thread safe.
Every function/method can access shared data. They're all expected to do it safely. The `shared` attribute just qualifies the `this` reference. This would prevent access to 'shared' data from any non-'shared' context, without some kind of bridge/cast that a programmer would use when they knew that they had acquired the lock or ownership of the data. Although this is what would make sense to me, it doesn't seem to match the current implementation of 'shared', or what you're saying. It wouldn't exactly "prevent" it, would it? The compiler can't check that you've got the correct lock. It would be expected of the programmer to do so before calling the `shared` method. That's easy to get wrong. When `shared` methods are safe themselves, you can't get the calls to them wrong. The ugly is nicely contained. To call an unsafe method, you have to cast, and that's a good indicator that you're entering the danger zone. It seems that methods qualified with 'shared' may be what you're suggesting matches up with the 'bridge' I'm trying to describe, but again, using the word 'shared' to mean both 'thread safe' and 'not thread safe' doesn't make sense. Maybe don't think of it as meaning "safe" or "unsafe" then. It just means "shared". A `shared` variable is just that: shared. The way you deal with it can be thread-safe or not. Everyone is expected to deal with it safely, though. "Everyone" includes `shared` methods. Firstly, because the same keyword should not mean two strictly opposite things. Also, if a 'shared' method is supposed to be thread safe, then the fact that it has access to shared data is irrelevant to the caller. Non-`shared` methods are not thread-safe. They expect unshared data. You can still call them on shared objects, though, with a cast. And when you've ensured thread-safety beforehand, it's perfectly fine to do so. If `shared` methods were unsafe too, then that would only allow calling unsafe code without a cast. Doesn't seem like an improvement.
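The lock-then-cast pattern described above might look like the following sketch; the `Account` class and its methods are invented purely for illustration:

```d
import core.sync.mutex : Mutex;

// Hypothetical class with ordinary, non-`shared` methods.
class Account
{
    private int balance;
    // Not thread-safe on its own: assumes the caller has exclusivity.
    void deposit(int amount) { balance += amount; }
    int getBalance() { return balance; }
}

void main()
{
    auto lk = new Mutex;
    shared Account acc = new shared(Account);

    // While holding the lock we have exclusive access, so it is
    // reasonable to cast away `shared` and call non-`shared` methods.
    // The cast is the visible marker of the danger zone.
    lk.lock();
    scope (exit) lk.unlock();
    auto unshared = cast(Account) acc;
    unshared.deposit(100);
    assert(unshared.getBalance() == 100);
}
```

The compiler cannot verify that the right lock is held; the cast simply documents where that responsibility falls on the programmer.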
So 'shared' as a method qualifier doesn't really make sense. What would make more sense is to have a region where 'shared' data could be accessed - maybe something like this:

struct S {
    shared int x;
    Lock lk;

    private void addNum(int n) shared {
        x += n;
    }

    int add(int a, int b) {
        shared {
            lk.lock();
            addNum(a);
            addNum(b);
            lk.unlock();
        }
        return x;
    }
}

So above, 1) 'x' would be shared, and mutating it would not be thread safe. As it is now. 2) 'addNum' would have access to 'shared' data, and also be non-thread-safe. Today, non-`shared` methods are unsafe, and they can access shared data just like `shared` methods. But I imagine you'd have `shared` methods alter `shared` data freely, without casting. 3) 'x' and 'addNum' would be inaccessible from 'add' since they're 'shared'. A
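A `shared` method that actually is thread-safe under today's semantics, as the replies above advocate, might be written with `core.atomic`; this is a minimal sketch (the `Counter` type is invented for illustration):

```d
import core.atomic : atomicOp, atomicLoad;

struct Counter
{
    private shared int count;

    // A `shared` method that is genuinely thread-safe: it only
    // touches `count` through atomic read-modify-write operations.
    void increment() shared
    {
        atomicOp!"+="(count, 1);
    }

    int get() shared
    {
        return atomicLoad(count);
    }
}

void main()
{
    import std.stdio : writeln;
    shared Counter c;
    c.increment();
    c.increment();
    writeln(c.get()); // prints 2
}
```

Callers holding a `shared Counter` can only reach the `shared` methods, so every access goes through the atomic operations.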
Re: Can this implementation of Damm algorithm be optimized?
On Sunday, 12 February 2017 at 05:54:34 UTC, Era Scarecrow wrote: On Saturday, 11 February 2017 at 21:56:54 UTC, Era Scarecrow wrote: Just ran the unittests under the dmd profiler; it says the algorithm is 11% faster now. So yeah, slightly more optimized. Ran some more tests. Without optimization but with 4 levels (a 2.5Mb table), the gain jumps to a whopping 27%! However, with optimizations turned on it dwindled to a mere 15% boost, and with optimization + no bounds checking, 2 and 4 levels both give a 9% boost total. Testing purely on 8-byte inputs (brute-forced all combinations) receives the same 9% boost with negligible difference. Safe to say going to higher levels isn't going to give you sufficient improvement; also the exe file is 3Mb big (but compresses to 150k). Wow! Thanks for the interest and effort.
Re: Can this implementation of Damm algorithm be optimized?
On Monday, 13 February 2017 at 00:56:37 UTC, Nestor wrote: On Sunday, 12 February 2017 at 05:54:34 UTC, Era Scarecrow wrote: Ran some more tests. Wow! Thanks for the interest and effort. Certainly. But the bulk of the answer comes down to this: the 2 levels that I've already provided are probably the fastest you're going to get. Certainly we can test using shorts or bytes instead, but it's likely the results will only go down. Note that my tests are strictly on my x86 system; it would be better to also test this on other systems like PPC, Linux, ARM, and other architectures to see how they perform, and possibly tweak them as appropriate. Still, we did find out there is some optimization that can be done successfully for the Damm algorithm; it just isn't going to be a lot. Hmmm... a thought does come to mind: parallelizing the code. However, that would probably require 11 instances to get a 2x speedup (calculating the second half with all 10 possibilities for the carry-over, and also calculating the first half, then choosing which of the 10 based on the first half's output), which only really works if you have a ton of cores and the input is REALLY REALLY large, like a meg or something. Whereas the usage of the Damm code is more useful for adding a digit to the end of a code like a UPC or barcode as error detection, and expecting input larger than 32 digits for real applications is unlikely. But at this point I'm rambling.
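For reference, the straightforward single-table Damm implementation that these optimizations build on looks roughly like this sketch (the table is the standard Damm quasigroup; function names are illustrative):

```d
// The standard Damm quasigroup operation table: each step maps
// (interim digit, input digit) -> new interim digit.
immutable ubyte[10][10] dammTable = [
    [0,3,1,7,5,9,8,6,4,2],
    [7,0,9,2,1,5,4,8,6,3],
    [4,2,0,6,8,7,1,3,5,9],
    [1,7,5,0,9,8,3,4,2,6],
    [6,1,2,3,0,4,5,9,7,8],
    [3,6,7,4,2,0,9,5,8,1],
    [5,8,6,9,7,2,0,1,3,4],
    [8,9,4,5,3,6,2,0,1,7],
    [9,4,3,8,6,1,7,2,0,5],
    [2,5,8,1,4,3,6,7,9,0],
];

// Returns the check digit for a string of decimal digits.
ubyte dammCheckDigit(string digits)
{
    ubyte interim = 0;
    foreach (c; digits)
        interim = dammTable[interim][c - '0'];
    return interim;
}

// A code with its check digit appended is valid iff the final interim is 0.
bool dammIsValid(string digits)
{
    return dammCheckDigit(digits) == 0;
}

void main()
{
    assert(dammCheckDigit("572") == 4); // well-known test vector
    assert(dammIsValid("5724"));
    assert(!dammIsValid("5734")); // single-digit error detected
}
```

The multi-level variants discussed above trade memory for speed by precomputing this table's composition over 2 or more input digits at a time.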
Re: Mallocator and 'shared'
On 02/13/2017 01:27 AM, Moritz Maxeiner wrote:

__gshared int a = 0;

// thread 1:
a = 1;
int b = a;
writeln(b);

// thread 2:
a += 1;

In the above, you may expect `b` to be either 1, or 2, depending on how the cpu interleaves the memory access, but it can, in fact, also be 0, since neither the compiler, nor the cpu can detect any reason as to why `a = 1` should need to come before `int b = a` and may thus reorder the write and the read. This doesn't make sense to me. b depends on a. If I run thread 1 alone, I can expect b to be 1, no? Thread 2 can then a) read 0, write 1; or b) read 1, write 2. How can b be 0 when the writeln is executed? An example like this makes more sense to me:

shared int a = 0, b = 0;

// Thread 1:
a = 1;
b = 2;

// Thread 2:
writeln(a + b);

One might think that thread 2 cannot print "2", because from the order of statements the numbers must be 0 + 0 or 1 + 0 or 1 + 2. But thread 1 might execute `b = 2` before `a = 1`, because the order doesn't matter to thread 1. So 0 + 2 is possible, too.
Re: Mallocator and 'shared'
On Monday, 13 February 2017 at 01:30:57 UTC, ag0aep6g wrote: This doesn't make sense to me. b depends on a. If I run thread 1 alone, I can expect b to be 1, no? Thread 2 can then a) read 0, write 1; or b) read 1, write 2. How can b be 0 when the writeln is executed? An example like this makes more sense to me:

shared int a = 0, b = 0;

// Thread 1:
a = 1;
b = 2;

// Thread 2:
writeln(a + b);

One might think that thread 2 cannot print "2", because from the order of statements the numbers must be 0 + 0 or 1 + 0 or 1 + 2. But thread 1 might execute `b = 2` before `a = 1`, because the order doesn't matter to thread 1. So 0 + 2 is possible, too. You're right, of course, and I shall do well to remember not to think up examples for non-sequential code at such an hour; I am sorry. Thank you for providing a correct example plus explanation. The rest of my post still stands, though. In recompense I shall provide another example, this one translated from Wikipedia[1] instead:

__gshared int f = 0, x = 0;

// thread 1
while (f == 0) {}
// Memory barrier required here
writeln(x);

// thread 2
x = 42;
// Memory barrier required here
f = 1;

The above demonstrates a case where you do need a memory barrier: thread 1 and thread 2 have a consumer/producer relationship, where thread 1 wants to consume a value from thread 2 via `x`, using `f` to be notified that `x` is ready to be consumed. Without memory barriers at both of the indicated lines, the CPU is free to reorder either thread: the first is required so that thread 1 doesn't get reordered to consume before being notified, and the second so that thread 2 doesn't get reordered to signal thread 1 before having produced. If we had made `f` and `x` `shared` instead of `__gshared`, the spec would require (at least) the two indicated memory barriers to be emitted. Currently, though, it won't, and for this case (AFAIK) `shared` won't get you any benefit over `__gshared`.
You'll still need to add those memory barriers yourself (probably using inline assembler, though I'm not sure what the best way is in D, since I usually just stick with message passing). [1] https://en.wikipedia.org/wiki/Memory_barrier
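In lieu of inline assembler, `core.atomic` can supply the needed barriers today; here is a sketch of the producer/consumer example above using acquire/release ordering (the thread setup is just for illustration):

```d
import core.atomic : atomicLoad, atomicStore, MemoryOrder;
import core.thread : Thread;

shared int f = 0;
shared int x = 0;

void consumer()
{
    // Acquire load: reads after this cannot be reordered before it,
    // so once we see f == 1, the write to x is guaranteed visible.
    while (atomicLoad!(MemoryOrder.acq)(f) == 0)
        Thread.yield();
    assert(atomicLoad(x) == 42);
}

void main()
{
    auto t = new Thread(&consumer);
    t.start();

    atomicStore(x, 42); // produce the value first
    // Release store: writes before this cannot be reordered after it,
    // so the flag is only raised once x is in place.
    atomicStore!(MemoryOrder.rel)(f, 1);

    t.join();
}
```

This pairs a release store in the producer with an acquire load in the consumer, which is exactly the pair of barriers the Wikipedia example calls for.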