Re: Variable-Length Bit-Level Encoding

2016-11-12 Thread Era Scarecrow via Digitalmars-d-learn

On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote:

0 => [0]
1 => [1,0]
2 => [1,1,0]

is easy but assumes a too extreme input value distribution.

Does anybody have a suggestion for an encoder that is more 
suitable for real-world values that are, for instance, normally 
distributed?


 Hmm off hand I'm recalling there's a different encoding that 
uses 0's to determine how long the encoding is, and then the 
first 1 breaks it off to the actual encoding. Great for short 
ints.


 1 = 0
 010 = 1
 011 = 2
 00101 = 4
 1 = 30

 https://en.wikipedia.org/wiki/Exponential-Golomb_coding

 If you know the range of the data, I'd almost be more inclined 
to do range encoding mixed with arithmetic coding.


Re: Is it possible to store different generic types in ex. an

2016-11-12 Thread Charles Hixson via Digitalmars-d-learn
On 11/09/2016 07:46 AM, Is it possible to store different generic types? 
via Digitalmars-d-learn wrote:
On Wednesday, 9 November 2016 at 15:44:59 UTC, Is it possible to store 
different generic types? wrote:
Is it possible to store different generic types in ex. somekind of 
container such as an array, hashtable etc.


Let's say we got

class Foo(T) {
...
}

Would it be possible to store something like

Foo[] foos; // Where Foo of course should allow any generic version 
of Foo


Ex.

Foo!int and Foo!string should both be able to be stored inside the 
array.


If so how would one construct an array like that or is there some 
other container that may be able to do it?


I'm aware that I could have Foo inherit a base class and then use 
casting, but I would like to avoid casting if possible.


You don't necessarily have to do casting if all of the objects that you 
are going to store use the same methods for the purposes that you are 
going to use them.  Just craft an interface that they all implement and 
inherit from that.  But if they don't all implement all the methods you 
are going to need, you can't avoid casting unless you do something like:
1) create a dummy class (I don't *think* an abstract class would work) 
that implements all the methods you are going to need with dummy methods.

2) inherit all the classes you are going to use from that class
3) be sure to use override on the methods that are actually implemented.

Even then I'm not sure this would work.  Since you are doing something like:
Dummy[Key]  stuff;
 when you retrieve an item from stuff, what you retrieve will be the 
actual item, but the system will probably think it's a Dummy unless you 
properly cast it.  So go with the interface approach, even if that means 
implementing dummy methods for some of the classes. (OTOH, I haven't 
actually *tried* that inherit from a common class...it might work.  I'd 
just bet against it without casting.)


The basic idea here is that all the objects that you store have to 
implement all the methods that you are going to use on any of them, but 
the implementations can be something like:

void method1(){assert (false, "You should never come here"); }



Re: Variable-Length Bit-Level Encoding

2016-11-12 Thread James Buren via Digitalmars-d-learn

On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote:
Does anybody have a suggestion for an encoder that is more 
suitable for real-world values that are, for instance, normally 
distributed?


I don't recall the name, but there is an algorithm for encoding 
data of an arbitrary number of bytes/bits into a stream of 
octets. It reserves the MSB of each octet for use as a marker. If 
it is set to 1, then there are yet more bits to read. If it is 
set to 0, then this is the last group of bits. This enables each 
octet to carry 7 bits at a time, while allowing you to encode 
data of any bit size into an octet stream. You just need to break 
your data down into groups of 7 bits.




Variable-Length Bit-Level Encoding

2016-11-12 Thread Nordlöw via Digitalmars-d-learn
I'm looking for libraries/snippets (either in D or similar 
languages) that perform variable-length encoding of unsigned 
integers onto a bit-stream. Requirement is that smaller inputs 
(integer values) should be encoded with equal or fewer bits.


This

0 => [0]
1 => [1,0]
2 => [1,1,0]

is easy but assumes a too extreme input value distribution.

Does anybody have a suggestion for an encoder that is more 
suitable for real-world values that are, for instance, normally 
distributed?


Re: Variable-Length Bit-Level Encoding

2016-11-12 Thread Nordlöw via Digitalmars-d-learn

On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote:
Does anybody have a suggestion for an encoder that is more 
suitable for real-world values that are, for instance, normally 
distributed?


Doh, forget the normal distribution thing here. It, of course, 
doesn't work with unsigned integers.


Re: Is it possible to store different generic types in ex. an

2016-11-12 Thread Erik van Velzen via Digitalmars-d-learn
On Wednesday, 9 November 2016 at 15:44:59 UTC, Is it possible to 
store different generic types? wrote:


Foo[] foos; // Where Foo of course should allow any generic 
version of Foo




You can use an array of std.variant


Re: static array internal & dangling reference

2016-11-12 Thread Mike Parker via Digitalmars-d-learn

On Saturday, 12 November 2016 at 13:11:02 UTC, Mike Parker wrote:


void func2(int[] foo) { foo ~= 10; }


Sorry, this should be (ref int[] foo)


Re: static array internal & dangling reference

2016-11-12 Thread Mike Parker via Digitalmars-d-learn

On Saturday, 12 November 2016 at 12:16:02 UTC, Andrew wrote:



Bear in mind that static arrays are implicitly sliced when 
passed to or returned from a function anywhere a dynamic array 
is expected. If you modify f() to look like this:


int[] f()
{
  int[10] sa;
  foreach(int i, ref sa_i;sa){
sa_i=i;
  }
  return sa;
}



I thought that if you changed the function signature to int[10] 
f() to return a static array, this should be returned by value, 
and so should be safe to use, but it seems that this too points 
to the same memory.


If the return type is declared as int[10], then yes, returning sa 
should copy all 10 elements in the return. I just verified what 
you said and also found that both have the same address. However, 
if you stomp the stack and then access any value from the array, 
you'll find that it's valid. When returning int[], this is not 
the case. This looks to me like an optimization by the compiler 
to avoid the copy, but I know nothing of the implementation 
details.


And for clarity, I think we should be careful with the term 'by 
value', as dynamic arrays are passed around by value, not by 
reference, just without the members being copied around. Keep in 
mind that a dynamic array is a length and a pointer. Those are 
not passed by reference, but by value. So that this:


void func1(int[] foo) { foo ~= 10; }

And this:

void func2(int[] foo) { foo ~= 10; }

Are not the same. func1 appends to the local slice, meaning it is 
no longer connected to the original that was passed in. func2 
appends to the original array.




To make it safe, should I be using:

auto sb = f().dup;


Yes, if you intend to keep the contents around indefinitely, this 
is the proper way to do it. My point in my previous post was just 
demonstrating that sb remains valid as long as its stack memory 
has not yet been overwritten and, as long as it is made use of 
immediately, nothing is going to break.




Re: static array internal & dangling reference

2016-11-12 Thread Andrew via Digitalmars-d-learn

On Saturday, 12 November 2016 at 11:03:31 UTC, Mike Parker wrote:

[...]


You *have* created a dangling pointer. It's just that for such 
a simple little program, the part of the stack where the 
original array was allocated isn't stomped at the point where 
you access it after the function call. The same sort of program 
will also work in C, where it's not uncommon for functions to 
return string literals that are immediately printed to stdout 
or a log file.


One difference from C in this case is that it's still possible 
to make things work even after the stack has been stomped and 
the memory is no longer valid simply by increasing the length 
of the returned slice: sb ~= 0, or perhaps sb.length+=n. This 
will cause an allocation and sb will then own the memory block 
to which it points.


Bear in mind that static arrays are implicitly sliced when 
passed to or returned from a function anywhere a dynamic array 
is expected. If you modify f() to look like this:


int[] f()
{
  int[10] sa;
  foreach(int i, ref sa_i;sa){
sa_i=i;
  }
  return sa;
}



I thought that if you changed the function signature to int[10] 
f() to return a static array, this should be returned by value, 
and so should be safe to use, but it seems that this too points 
to the same memory.


To make it safe, should I be using:

auto sb = f().dup;

Thanks

Andrew


Re: static array internal & dangling reference

2016-11-12 Thread Picaud Vincent via Digitalmars-d-learn

On Saturday, 12 November 2016 at 11:03:31 UTC, Mike Parker wrote:

Thank you very much for your clarifications & explanations, I am 
reassured to see that things work like in C.


I will also look the links you provided, thank you again for your 
time.


Vincent





Re: static array internal & dangling reference

2016-11-12 Thread Picaud Vincent via Digitalmars-d-learn

Thank you for your answer cym13.

I reproduced your result for:

On Saturday, 12 November 2016 at 10:45:23 UTC, cym13 wrote:

void f_test() {
auto sb=f();
sb[2] = 100;
writeln(sb[2]); // prints 100
int test[100];
writeln(sb[2]); // prints 0
}


now I am convinced of the invalid access, and this time valgrind 
was not happy, insulting me with:


==13545== Conditional jump or move depends on uninitialised 
value(s)
==13545==at 0x10A6BA: 
_D3std4conv55__T11toTextRangeTiTS3std5stdio4File17LockingTextWriterZ11toTextRangeFNfiS3std5stdio4File17LockingTextWriterZv (indexType.d:5123)
==13545==by 0x10ABED: 
_D3std5stdio4File14__T5writeTiTaZ5writeMFNfiaZv (stdio.d:1424)
==13545==by 0x10A08C: 
_D3std5stdio14__T7writelnTiZ7writelnFNfiZv (stdio.d:3211)



However for my first example, I have checked again and

void f_test()
{
  auto sb=f();
  sb[2]=100;
}

Valgrind is silent... I thought it was because of a compiler 
optimization that removed the unusued "sb[2]=100" statement, but 
even with:



void f_test()
{
  auto sb=f();
  sb[2]=100;
  writeln(sb[2]);
}

which prints 100, Valgrind still do not complain... ?!?

I will try to investigate this and give a feedback if I find an 
explanation.


Vincent


Re: static array internal & dangling reference

2016-11-12 Thread Mike Parker via Digitalmars-d-learn
On Saturday, 12 November 2016 at 10:33:05 UTC, Picaud Vincent 
wrote:

Hi all,
Still learning... This time what surprised me is how static 
arrays work.
I assume (is it true?) that for efficiency reason static size 
arrays like int[10] are on the stack and do not involve dynamic 
memory allocation:


Yes, they are on the stack. That's what the 'static' in 'static 
array' implies. The same is true in C and C++ (and other 
programming languages that distinguish between static and dynamic 
arrays). It's why their length is fixed and known at compile time.




First surprise: it is possible to share a static array:

void main() {

  int[10] sa;
  foreach(int i, ref sa_i;sa){
sa_i=i;
  }
  writeln("\n vect init sa",sa);

  int[] sb=sa[3..6];   // <- did not think it was possible
  sb[2]=100;

  writeln("\n vect sa ",sa);
  writeln("\n vect sb ",sb);
}

which prints:

 vect init sa[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

 vect sa [0, 1, 2, 3, 4, 100, 6, 7, 8, 9]  // <- sa and sb are 
really shared


sb is a slice of sa. A slice *always* shares the memory block 
backing the original array until the slice is reallocated, which 
on a fresh slice of an existing array is pretty much always going 
to happen as soon as you append to it.




Second surprise: I tried to create a dangling reference with:

int[] f()
{
  int[10] sa;
  foreach(int i, ref sa_i;sa){
sa_i=i;
  }
  int[] sb=sa;
  return sb;
}

void f_test()
{
  auto sb=f();
  sb[2]=100; // I expected an "invalid access" (it points to 
"sa", a local variable)

}

However calling f_test seems ok and valgrind tool does not 
complain...


So my questions are:
0/ does I miss something? (in C++ for sure you create a 
dangling pointer)
1/ what is the internal mechanism for that? Is the GC involved? 
Performance impact?

2/ any link describing this in more details is welcome



You *have* created a dangling pointer. It's just that for such a 
simple little program, the part of the stack where the original 
array was allocated isn't stomped at the point where you access 
it after the function call. The same sort of program will also 
work in C, where it's not uncommon for functions to return string 
literals that are immediately printed to stdout or a log file.


One difference from C in this case is that it's still possible to 
make things work even after the stack has been stomped and the 
memory is no longer valid simply by increasing the length of the 
returned slice: sb ~= 0, or perhaps sb.length+=n. This will cause 
an allocation and sb will then own the memory block to which it 
points.


Bear in mind that static arrays are implicitly sliced when passed 
to or returned from a function anywhere a dynamic array is 
expected. If you modify f() to look like this:


int[] f()
{
  int[10] sa;
  foreach(int i, ref sa_i;sa){
sa_i=i;
  }
  return sa;
}

You will now get an error about an escaping reference.

If you haven't read it already, I suggest taking a look at 
Steven's array article [1]. I also answer these questions in 
Chapter 2 of Learning D [2], and I'm pretty sure Ali talks about 
it somewhere in 'Programming in D' [3].


[1] https://dlang.org/d-array-article.html
[2] https://www.packtpub.com/application-development/learning-d
[3] http://ddili.org/ders/d.en/index.html


Re: static array internal & dangling reference

2016-11-12 Thread cym13 via Digitalmars-d-learn
On Saturday, 12 November 2016 at 10:33:05 UTC, Picaud Vincent 
wrote:

Hi all,
Still learning... This time what surprised me is how static 
arrays work.
I assume (is it true?) that for efficiency reason static size 
arrays like int[10] are on the stack and do not involve dynamic 
memory allocation:


Yes, they're on the stack. They're just like a C-array: a plain 
region with items in sequence.



First surprise: it is possible to share a static array:

void main() {

  int[10] sa;
  foreach(int i, ref sa_i;sa){
sa_i=i;
  }
  writeln("\n vect init sa",sa);

  int[] sb=sa[3..6];   // <- did not think it was possible
  sb[2]=100;

  writeln("\n vect sa ",sa);
  writeln("\n vect sb ",sb);
}


Note that sb is a slice: it's just a pointer and a length. A 
pointer to what? To the third element of sa because that's the 
beginning of the slice.



which prints:

 vect init sa[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

 vect sa [0, 1, 2, 3, 4, 100, 6, 7, 8, 9]  // <- sa and sb are 
really shared


Well, sure, sb is a pointer to part of sa so you're accessing sa 
by reference when making use of sb (well, with an offset that is).



 vect sb [3, 4, 100]

Second surprise: I tried to create a dangling reference with:

int[] f()
{
  int[10] sa;
  foreach(int i, ref sa_i;sa){
sa_i=i;
  }
  int[] sb=sa;
  return sb;
}

void f_test()
{
  auto sb=f();
  sb[2]=100; // I expected an "invalid access" (it points to 
"sa", a local variable)

}

However calling f_test seems ok and valgrind tool does not 
complain...


Printing the address of sa[2] and sb[2] in f() as well as the 
address of sb[2] in f_test() we see that they're all the same. So 
yes you are using a dangling reference. It only gives you the 
good result because you haven't overwritten that part of the 
stack yet. Demonstration:


void f_test() {
auto sb=f();
sb[2] = 100;
writeln(sb[2]); // prints 100
int test[100];
writeln(sb[2]); // prints 0
}

I don't know why valgrind isn't panicking.


So my questions are:
0/ does I miss something? (in C++ for sure you create a 
dangling pointer)
1/ what is the internal mechanism for that? Is the GC involved? 
Performance impact?

2/ any link describing this in more details is welcome

Thank you





static array internal & dangling reference

2016-11-12 Thread Picaud Vincent via Digitalmars-d-learn

Hi all,
Still learning... This time what surprised me is how static 
arrays work.
I assume (is it true?) that for efficiency reason static size 
arrays like int[10] are on the stack and do not involve dynamic 
memory allocation:


First surprise: it is possible to share a static array:

void main() {

  int[10] sa;
  foreach(int i, ref sa_i;sa){
sa_i=i;
  }
  writeln("\n vect init sa",sa);

  int[] sb=sa[3..6];   // <- did not think it was possible
  sb[2]=100;

  writeln("\n vect sa ",sa);
  writeln("\n vect sb ",sb);
}

which prints:

 vect init sa[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

 vect sa [0, 1, 2, 3, 4, 100, 6, 7, 8, 9]  // <- sa and sb are 
really shared


 vect sb [3, 4, 100]

Second surprise: I tried to create a dangling reference with:

int[] f()
{
  int[10] sa;
  foreach(int i, ref sa_i;sa){
sa_i=i;
  }
  int[] sb=sa;
  return sb;
}

void f_test()
{
  auto sb=f();
  sb[2]=100; // I expected an "invalid access" (it points to 
"sa", a local variable)

}

However calling f_test seems ok and valgrind tool does not 
complain...


So my questions are:
0/ does I miss something? (in C++ for sure you create a dangling 
pointer)
1/ what is the internal mechanism for that? Is the GC involved? 
Performance impact?

2/ any link describing this in more details is welcome

Thank you