Re: Variable-Length Bit-Level Encoding
On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote: 0 => [0] 1 => [1,0] 2 => [1,1,0] is easy but assumes a too extreme input value distribution. Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed? Hmm off hand I'm recalling there's a different encoding that uses 0's to determine how long the encoding is, and then the first 1 breaks it off to the actual encoding. Great for short ints. 1 = 0 010 = 1 011 = 2 00101 = 4 1 = 30 https://en.wikipedia.org/wiki/Exponential-Golomb_coding If you know the range of the data, I'd almost be more inclined to do range encoding mixed with arithmetic coding.
Re: Is it possible to store different generic types in ex. an
On 11/09/2016 07:46 AM, Is it possible to store different generic types? via Digitalmars-d-learn wrote: On Wednesday, 9 November 2016 at 15:44:59 UTC, Is it possible to store different generic types? wrote: Is it possible to store different generic types in ex. somekind of container such as an array, hashtable etc. Let's say we got class Foo(T) { ... } Would it be possible to store something like Foo[] foos; // Where Foo of course should allow any generic version of Foo Ex. Foo!int and Foo!string should both be able to be stored inside the array. If so how would one construct an array like that or is there some other container that may be able to do it? I'm aware that I could have Foo inherit a base class and then use casting, but I would like to avoid casting if possible. You don't necessarily have to do casting if all of the objects that you are going to store use the same methods for the purposes that you are going to use them. Just craft an interface that they all implement and inherit from that. But if they don't all implement all the methods you are going to need, you can't avoid casting unless you do something like: 1) create a dummy class (I don't *think* an abstract class would work) that implements all the methods you are going to need with dummy methods. 2) inherit all the classes you are going to use from that class 3) be sure to use override on the methods that are actually implemented. Even then I'm not sure this would work. Since you are doing something like: Dummy[Key] stuff; when you retrieve an item from stuff, what you retrieve will be the actual item, but the system will probably think it's a Dummy unless you properly cast it. So go with the interface approach, even if that means implementing dummy methods for some of the classes. (OTOH, I haven't actually *tried* that inherit from a common class...it might work. I'd just bet against it without casting.) The basic idea here is that all the objects that you store have to implement all the methods that you are going to use on any of them, but the implementations can be something like: void method1(){assert (false, "You should never come here"); }
Re: Variable-Length Bit-Level Encoding
On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote: Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed? I don't recall the name, but there is an algorithm for encoding data of an arbitrary number of bytes/bits into a stream of octets. It reserves the MSB of each octet for use as a marker. If it is set to 1, then there are yet more bits to read. If it is set to 0, then this is the last group of bits. This enables each octet to carry 7 bits at a time, while allowing you to encode data of any bit size into an octet stream. You just need to break your data down into groups of 7 bits.
Variable-Length Bit-Level Encoding
I'm looking for libraries/snippets (either in D or similar languages) that perform variable-length encoding of unsigned integers onto a bit-stream. Requirement is that smaller inputs (integer values) should be encoded with equal or fewer bits. This 0 => [0] 1 => [1,0] 2 => [1,1,0] is easy but assumes a too extreme input value distribution. Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed?
Re: Variable-Length Bit-Level Encoding
On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote: Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed? Doh, forget the normal distribution thing here. It, of course, doesn't work with unsigned integers.
Re: Is it possible to store different generic types in ex. an
On Wednesday, 9 November 2016 at 15:44:59 UTC, Is it possible to store different generic types? wrote: Foo[] foos; // Where Foo of course should allow any generic version of Foo You can use an array of std.variant
Re: static array internal & dangling reference
On Saturday, 12 November 2016 at 13:11:02 UTC, Mike Parker wrote: void func2(int[] foo) { foo ~= 10; } Sorry, this should be (ref int[] foo)
Re: static array internal & dangling reference
On Saturday, 12 November 2016 at 12:16:02 UTC, Andrew wrote: Bear in mind that static arrays are implicitly sliced when passed to or returned from a function anywhere a dynamic array is expected. If you modify f() to look like this: int[] f() { int[10] sa; foreach(int i, ref sa_i;sa){ sa_i=i; } return sa; } I thought that if you changed the function signature to int[10] f() to return a static array, this should be returned by value, and so should be safe to use, but it seems that this too points to the same memory. If the return type is declared as int[10], then yes, returning sa should copy all 10 elements in the return. I just verified what you said and also found that both have the same address. However, if you stomp the stack and then access any value from the array, you'll find that it's valid. When returning int[], this is not the case. This looks to me like an optimization by the compiler to avoid the copy, but I know nothing of the implementation details. And for clarity, I think we should be careful with the term 'by value', as dynamic arrays are passed around by value, not by reference, just without the members being copied around. Keep in mind that a dynamic array is a length and a pointer. Those are not passed by reference, but by value. So that this: void func1(int[] foo) { foo ~= 10; } And this: void func2(int[] foo) { foo ~= 10; } Are not the same. func1 appends to the local slice, meaning it is no longer connected to the original that was passed in. func2 appends to the original array. To make it safe, should I be using: auto sb = f().dup; Yes, if you intend to keep the contents around indefinitely, this is the proper way to do it. My point in my previous post was just demonstrating that sb remains valid as long as its stack memory has not yet been overwritten and, as long as it is made use of immediately, nothing is going to break.
Re: static array internal & dangling reference
On Saturday, 12 November 2016 at 11:03:31 UTC, Mike Parker wrote: [...] You *have* created a dangling pointer. It's just that for such a simple little program, the part of the stack where the original array was allocated isn't stomped at the point where you access it after the function call. The same sort of program will also work in C, where it's not uncommon for functions to return string literals that are immediately printed to stdout or a log file. One difference from C in this case is that it's still possible to make things work even after the stack has been stomped and the memory is no longer valid simply by increasing the length of the returned slice: sb ~= 0, or perhaps sb.length+=n. This will cause an allocation and sb will then own the memory block to which it points. Bear in mind that static arrays are implicitly sliced when passed to or returned from a function anywhere a dynamic array is expected. If you modify f() to look like this: int[] f() { int[10] sa; foreach(int i, ref sa_i;sa){ sa_i=i; } return sa; } I thought that if you changed the function signature to int[10] f() to return a static array, this should be returned by value, and so should be safe to use, but it seems that this too points to the same memory. To make it safe, should I be using: auto sb = f().dup; Thanks Andrew
Re: static array internal & dangling reference
On Saturday, 12 November 2016 at 11:03:31 UTC, Mike Parker wrote: Thank you very much for your clarifications & explanations, I am reassured to see that things work like in C. I will also look the links you provided, thank you again for your time. Vincent
Re: static array internal & dangling reference
Thank you for your answer cym13. I reproduced your result for: On Saturday, 12 November 2016 at 10:45:23 UTC, cym13 wrote: void f_test() { auto sb=f(); sb[2] = 100; writeln(sb[2]); // prints 100 int test[100]; writeln(sb[2]); // prints 0 } now I am convinced of the invalid access, and this time valgrind was not happy, insulting me with: ==13545== Conditional jump or move depends on uninitialised value(s) ==13545==at 0x10A6BA: _D3std4conv55__T11toTextRangeTiTS3std5stdio4File17LockingTextWriterZ11toTextRangeFNfiS3std5stdio4File17LockingTextWriterZv (indexType.d:5123) ==13545==by 0x10ABED: _D3std5stdio4File14__T5writeTiTaZ5writeMFNfiaZv (stdio.d:1424) ==13545==by 0x10A08C: _D3std5stdio14__T7writelnTiZ7writelnFNfiZv (stdio.d:3211) However for my first example, I have checked again and void f_test() { auto sb=f(); sb[2]=100; } Valgrind is silent... I thought it was because of a compiler optimization that removed the unusued "sb[2]=100" statement, but even with: void f_test() { auto sb=f(); sb[2]=100; writeln(sb[2]); } which prints 100, Valgrind still do not complain... ?!? I will try to investigate this and give a feedback if I find an explanation. Vincent
Re: static array internal & dangling reference
On Saturday, 12 November 2016 at 10:33:05 UTC, Picaud Vincent wrote: Hi all, Still learning... This time what surprised me is how static arrays work. I assume (is it true?) that for efficiency reason static size arrays like int[10] are on the stack and do not involve dynamic memory allocation: Yes, they are on the stack. That's what the 'static' in 'static array' implies. The same is true in C and C++ (and other programming languages that distinguish between static and dynamic arrays). It's why their length is fixed and known at compile time. First surprise: it is possible to share a static array: void main() { int[10] sa; foreach(int i, ref sa_i;sa){ sa_i=i; } writeln("\n vect init sa",sa); int[] sb=sa[3..6]; // <- did not think it was possible sb[2]=100; writeln("\n vect sa ",sa); writeln("\n vect sb ",sb); } which prints: vect init sa[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] vect sa [0, 1, 2, 3, 4, 100, 6, 7, 8, 9] // <- sa and sb are really shared sb is a slice of sa. A slice *always* shares the memory block backing the original array until the slice is reallocated, which on a fresh slice of an existing array is pretty much always going to happen as soon as you append to it. Second surprise: I tried to create a dangling reference with: int[] f() { int[10] sa; foreach(int i, ref sa_i;sa){ sa_i=i; } int[] sb=sa; return sb; } void f_test() { auto sb=f(); sb[2]=100; // I expected an "invalid access" (it points to "sa", a local variable) } However calling f_test seems ok and valgrind tool does not complain... So my questions are: 0/ does I miss something? (in C++ for sure you create a dangling pointer) 1/ what is the internal mechanism for that? Is the GC involved? Performance impact? 2/ any link describing this in more details is welcome You *have* created a dangling pointer. It's just that for such a simple little program, the part of the stack where the original array was allocated isn't stomped at the point where you access it after the function call. The same sort of program will also work in C, where it's not uncommon for functions to return string literals that are immediately printed to stdout or a log file. One difference from C in this case is that it's still possible to make things work even after the stack has been stomped and the memory is no longer valid simply by increasing the length of the returned slice: sb ~= 0, or perhaps sb.length+=n. This will cause an allocation and sb will then own the memory block to which it points. Bear in mind that static arrays are implicitly sliced when passed to or returned from a function anywhere a dynamic array is expected. If you modify f() to look like this: int[] f() { int[10] sa; foreach(int i, ref sa_i;sa){ sa_i=i; } return sa; } You will now get an error about an escaping reference. If you haven't read it already, I suggest taking a look at Steven's array article [1]. I also answer these questions in Chapter 2 of Learning D [2], and I'm pretty sure Ali talks about it somewhere in 'Programming in D' [3]. [1] https://dlang.org/d-array-article.html [2] https://www.packtpub.com/application-development/learning-d [3] http://ddili.org/ders/d.en/index.html
Re: static array internal & dangling reference
On Saturday, 12 November 2016 at 10:33:05 UTC, Picaud Vincent wrote: Hi all, Still learning... This time what surprised me is how static arrays work. I assume (is it true?) that for efficiency reason static size arrays like int[10] are on the stack and do not involve dynamic memory allocation: Yes, they're on the stack. They're just like a C-array: a plain region with items in sequence. First surprise: it is possible to share a static array: void main() { int[10] sa; foreach(int i, ref sa_i;sa){ sa_i=i; } writeln("\n vect init sa",sa); int[] sb=sa[3..6]; // <- did not think it was possible sb[2]=100; writeln("\n vect sa ",sa); writeln("\n vect sb ",sb); } Note that sb is a slice: it's just a pointer and a length. A pointer to what? To the third element of sa because that's the beginning of the slice. which prints: vect init sa[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] vect sa [0, 1, 2, 3, 4, 100, 6, 7, 8, 9] // <- sa and sb are really shared Well, sure, sb is a pointer to part of sa so you're accessing sa by reference when making use of sb (well, with an offset that is). vect sb [3, 4, 100] Second surprise: I tried to create a dangling reference with: int[] f() { int[10] sa; foreach(int i, ref sa_i;sa){ sa_i=i; } int[] sb=sa; return sb; } void f_test() { auto sb=f(); sb[2]=100; // I expected an "invalid access" (it points to "sa", a local variable) } However calling f_test seems ok and valgrind tool does not complain... Printing the address of sa[2] and sb[2] in f() as well as the address of sb[2] in f_test() we see that they're all the same. So yes you are using a dangling reference. It only gives you the good result because you haven't overwritten that part of the stack yet. Demonstration: void f_test() { auto sb=f(); sb[2] = 100; writeln(sb[2]); // prints 100 int test[100]; writeln(sb[2]); // prints 0 } I don't know why valgrind isn't panicking. So my questions are: 0/ does I miss something? (in C++ for sure you create a dangling pointer) 1/ what is the internal mechanism for that? Is the GC involved? Performance impact? 2/ any link describing this in more details is welcome Thank you
static array internal & dangling reference
Hi all, Still learning... This time what surprised me is how static arrays work. I assume (is it true?) that for efficiency reason static size arrays like int[10] are on the stack and do not involve dynamic memory allocation: First surprise: it is possible to share a static array: void main() { int[10] sa; foreach(int i, ref sa_i;sa){ sa_i=i; } writeln("\n vect init sa",sa); int[] sb=sa[3..6]; // <- did not think it was possible sb[2]=100; writeln("\n vect sa ",sa); writeln("\n vect sb ",sb); } which prints: vect init sa[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] vect sa [0, 1, 2, 3, 4, 100, 6, 7, 8, 9] // <- sa and sb are really shared vect sb [3, 4, 100] Second surprise: I tried to create a dangling reference with: int[] f() { int[10] sa; foreach(int i, ref sa_i;sa){ sa_i=i; } int[] sb=sa; return sb; } void f_test() { auto sb=f(); sb[2]=100; // I expected an "invalid access" (it points to "sa", a local variable) } However calling f_test seems ok and valgrind tool does not complain... So my questions are: 0/ does I miss something? (in C++ for sure you create a dangling pointer) 1/ what is the internal mechanism for that? Is the GC involved? Performance impact? 2/ any link describing this in more details is welcome Thank you