Re: internal representation of struct

2011-03-18 Thread Simon Buerger

On 18.03.2011 17:34, lenochware wrote:

Hello, I have an array of type vertex_t vertices[] where vertex_t is:

struct vertex_t {
   float[3] xyz;
   ubyte[4] color;
   ...
}

Now, I would like to use vec3f xyz instead of the array float[3] xyz, where 
vec3f is:

struct vec3f {
   float x, y, z;

...some functions...
}

where I have defined some operators for adding and multiplying vectors etc.

Is it possible? xyz must be exactly 3 floats in memory because I am sending
the vertices array pointer into the OpenGL rendering pipeline.
Will struct vec3f be represented as just 3 floats - or is there some
additional metadata, since the structure also contains some functions?


There is no metadata in structs. The only thing added is padding in 
order to align the fields. But floats have a size of 4 and the same 
alignment, so your vec3f will be exactly 12 bytes without any problem 
(I've done it myself for OpenGL).
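That claim can be checked at compile time. A minimal sketch (the `scaled` method is made up, only there to show that member functions add no per-instance data):

```d
// sketch: verify that vec3f has exactly the layout of three packed floats
struct vec3f
{
    float x, y, z;

    // member functions live in the code segment, not in the instance
    vec3f scaled(float s) const { return vec3f(x*s, y*s, z*s); }
}

static assert(vec3f.sizeof  == 3 * float.sizeof); // 12 bytes, no metadata
static assert(vec3f.alignof == float.alignof);    // no extra padding
```

If either static assert fails, the compiler rejects the program, so the OpenGL interop assumption is verified before anything runs.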



My additional question: what about efficiency? I would like to keep struct
vertex_t as simple as possible, because I need to do calculations fast. Will
replacing the float array with a structure have some impact on speed?


It depends: when you do a component-wise vector multiplication, you 
want the compiler to use SSE and the like. With

float[3] a,b,c;
c[] = a[]*b[];

that will probably work (that is one reason the []-notation exists in 
the first place). But with

vec3f a,b,c;
c.x = a.x*b.x; c.y = a.y*b.y; c.z = a.z*b.z;

it really depends on the quality of the compiler whether it sees the 
opportunity for optimisation.


If you want to be sure, I suggest the following:

struct vec3
{
    float[3] data;

    vec3 opMul(vec3 other)
    {
        vec3 tmp = this;
        tmp.data[] *= other.data[];
        return tmp;
    }
}

As a last note: when you do OpenGL, the biggest part of the calculations 
should be done on the GPU, so the few on the CPU might not be 
performance-critical at all. But that depends, of course, on the exact 
thing you want to do.


- Krox



context-free grammar

2011-03-04 Thread Simon Buerger
It is often said that D's grammar is easier to parse than C++'s, i.e. it 
should be possible to separate syntactic and semantic analysis, which 
is not possible in C++ because of the template syntax and so on. But I 
found the following example:


The line a * b = c; can be interpreted in two ways:
- Declaration of a variable b of type a*
- (a*b) is itself an lvalue which is assigned to.

Current D (gdc 2.051) always interprets it the first way and yields 
an error if the second is meant. The workaround is simply to use 
parens, like (a*b) = c, so it's not a real issue. But at the same time, 
C++ (gcc 4.5) has no problem distinguishing the two even without parens.


So, is advertising the grammar as context-free wrong?

- Krox


Re: Should conversion of mutable return value to immutable allowed?

2011-02-24 Thread Simon Buerger

On 24.02.2011 19:08, Ali Çehreli wrote:

Implicit conversions to immutable in the following two functions feel
harmless. Has this been discussed before?

string foo()
{
char[] s;
return s; // Error: cannot implicitly convert expression
// (s) of type char[] to string
}

string bar()
{
char[] s;
return s ~ s; // Error: cannot implicitly convert expression
// (s ~ s) of type char[] to string
}

Is there a reason why that's not possible? I am sure there must be
other cases that at least I would find harmless. :)

Ali


Currently, the correct way to do it is to use the phobos function 
assumeUnique, like:


string bar()
{
char[] s;
return assumeUnique(s);
}

Note that this does little more than casting to immutable, so you 
have to ensure there is no mutable reference left behind.


Anyway, it might be nice if the compiler could detect some trivial 
cases and insert the cast automatically. But on the other hand, the 
compiler will never be able to auto-detect all cases, so it's 
cleaner to use assumeUnique explicitly.
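A small sketch of that caveat: the ref overload of assumeUnique (in std.exception) nulls its argument, precisely so that no mutable alias to the data survives the call:

```d
import std.exception : assumeUnique;

string build()
{
    char[] buf = new char[](3);
    buf[] = 'a';
    string s = assumeUnique(buf); // casts to immutable...
    assert(buf is null);          // ...and nulls buf, so no mutable
                                  // reference to the data is left behind
    return s;
}
```

If a copy of the mutable slice had been stashed elsewhere before the call, the immutability guarantee would be silently broken, which is exactly the responsibility the caller takes on.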


- Krox


toHash /opCmp for builtin-types

2011-02-21 Thread Simon Buerger
Following came to my mind while coding some generic collection 
classes: the toHash and opCmp operations are not supported for 
built-in types even though their implementation is trivial.


* toHash
The code is already there inside TypeInfo.getHash. But 
typeid(value).getHash(value) is much uglier than value.toHash. Note 
that hashes make sense for integers (trivial implementation), not 
necessarily for floats.


* opCmp
Would be useful for delegating opCmp of a struct to one member. 
Alternative: introduce a new operator which returns 1/0/-1 (Ruby does 
this with <=>). Currently I end up writing:


int opCmp(...)
{
    if(a > b) return +1;
    if(a == b) return 0;
    return -1;
}

which uses 2 comparisons where only 1 is needed (though the compiler 
might notice this if the comparison is pure and so on).
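The delegation case mentioned above could look like this (a sketch; the struct and member names are invented):

```d
struct Entry
{
    int key;
    string payload;

    // delegate the struct's ordering to a single member
    int opCmp(const Entry other) const
    {
        if (key > other.key) return +1;
        if (key == other.key) return 0;
        return -1;
    }
}

unittest
{
    assert(Entry(1, "a") < Entry(2, "b"));
    assert(Entry(2, "x") >= Entry(2, "y")); // payload is ignored
}
```

A built-in opCmp on int (or a 1/0/-1 operator) would collapse the three-line body to a single delegating expression.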


Furthermore it might be a nice idea to have toString (or the future 
writeTo) for built-in types. It would need some new code in the 
core library, but could simplify generic programming.


any thoughts?

- Krox


Re: toHash /opCmp for builtin-types

2011-02-21 Thread Simon Buerger

On 21.02.2011 21:22, Daniel Gibson wrote:

Am 21.02.2011 20:59, schrieb Simon Buerger:

Following came to my mind while coding some generic collection classes:
The toHash and opCmp operations are not supported for builtin-types
though their implementation is trivial.

* toHash
The code is already there inside TypeInfo.getHash. But
typeid(value).getHash(value) is much uglier than value.toHash. Note
that hashes make sense for integer (trivial implementation), not
necessarily for floats.

* opCmp
Would be useful for delegating opCmp of a struct to one member.
Alternative: Introduce a new operator which returns 1/0/-1 (Ruby does
this with <=>). Currently I end up writing:

int opCmp(...)
{
    if(a > b) return +1;
    if(a == b) return 0;
    return -1;
}

which uses 2 comparisons where only 1 is needed (though the compiler
might notice it if the comparison is pure and so on).

Furthermore it might be a nice idea to have toString (or the future
writeTo) for builtin-types. It would need some new code in the
core-lib, but could simplify generic programming.

any thoughts?

- Krox


Well, opCmp() can be done easier, at least for ints:

int opCmp(...) {
return a-b;
}


sadly no: (-3_000_000_000) - (3_000_000_000) = -1_705_032_704, which is 
incorrect. It does however work for short/byte (with opCmp still 
returning int).



For floats.. well, if you don't want/need any tolerance this would
work as well, else it'd be more difficult.


I'm not sure it makes sense for floats. Furthermore, does 
TypeInfo.getHash support them?



A <=> operator would be neat, though.




Re: toHash /opCmp for builtin-types

2011-02-21 Thread Simon Buerger

sadly no: (-3_000_000_000) - (3_000_000_000) = -1_705_032_704, which is
incorrect. It does however work for short/byte (and opCmp still
returning int).


oops, wrong example. It is: (-2_000_000_000) - (2_000_000_000) = 294_967_296, 
sry. Anyway, you see the point with overflows
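One overflow-safe way to still avoid the double comparison (a sketch, not from the thread): compare instead of subtracting; in D a bool subtraction promotes to int, yielding -1, 0 or +1 directly:

```d
// overflow-safe comparison: never subtracts the values themselves
int cmp(int a, int b)
{
    return (a > b) - (a < b); // bool arithmetic: -1, 0 or +1
}

unittest
{
    assert(cmp(int.min, int.max) < 0); // a - b would overflow here
    assert(cmp(-5, -5) == 0);
    assert(cmp(3, -3) > 0);
}
```

This works for the full int range, unlike the a-b trick, which is only safe when the difference fits in the result type (e.g. short/byte operands with an int result).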


Re: float equality

2011-02-19 Thread Simon Buerger

On 19.02.2011 13:06, spir wrote:

Hello,

What do you think of this?

unittest {
assert(-1.1 + 2.2 == 1.1); // pass
assert(-1.1 + 2.2 + 3.3 == 4.4); // pass
assert(-1.1 + 3.3 + 2.2 == 4.4); // fail
assert(-1.1 + 3.3 == 2.2); // fail
}

There is approxEquals in stdlib, right; but shouldn't builtin == be
consistent anyway?

Denis



It is generally a bad idea to use == with floats, as most decimals 
can't be represented exactly in binary floating point. That means 
there is no float with the value 1.1, though there are exact floats 
for 0.25, 0.5, 42.125 and so on. The only reason the first two test 
cases work is that both sides of the == are rounded the same way, but 
you should not rely on that.


Also note that these calculations are probably done at compile time, 
and the compiler is allowed to use a higher precision than at 
run time, so you might get different results when you let the user 
input the numbers.
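In code, the tolerance-based comparison from Phobos (approxEqual in std.math at the time; later renamed isClose) could be used like this:

```d
import std.math : approxEqual;

unittest
{
    double x = -1.1 + 3.3;
    assert(x != 2.2);            // exact comparison fails, as in the post
    assert(approxEqual(x, 2.2)); // tolerance-based comparison passes
}
```

The default tolerances are fairly loose; for stricter checks the relative and absolute difference can be passed explicitly.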


- Krox


Re: Number of references to a Class Object

2011-02-12 Thread Simon Buerger

On 12.02.2011 11:47, bearophile wrote:

d coder:


Is there a way for users like myself to vote up an issue on DMD Bugzilla?


In this case I think voting is not so useful. I think that actually 
implementing weak references is better (and later they may be added to Phobos). 
It requires some work and knowledge of D and its GC. rebindable() may be used 
as a starting point to copy from...

Bye,
bearophile


Also tango (for D 1.0) implements it.
Link: 
http://www.dsource.org/projects/tango/docs/current/tango.core.WeakRef.html


Might be worth a look if you are going to implement it for D 2.0.

- Krox


Re: Decision on container design

2011-02-01 Thread Simon Buerger

On 01.02.2011 18:08, Steven Schveighoffer wrote:

On Tue, 01 Feb 2011 11:44:36 -0500, Michel Fortin
michel.for...@michelf.com wrote:


On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu
seewebsiteforem...@erdani.org said:


On 1/28/11 8:12 PM, Michel Fortin wrote:

On 2011-01-28 20:10:06 -0500, Denis Koroskin 2kor...@gmail.com
said:


Unfortunately, this design has big issues:
void fill(Appender appender)
{
appender.put("hello");
appender.put("world");
}
void test()
{
Appender!string appender;
fill(appender); // Appender is supposed to have reference semantics
assert(appender.length != 0); // fails!
}
Asserting above fails because at the time you pass the appender
object to the fill method it isn't initialized yet (lazy
initialization). As such, a null is passed; an instance is created at
the first append, but the result isn't seen by the caller.

That's indeed a problem. I don't think it's a fatal flaw however,
given
that the idiom already exists in AAs.
That said, the nice thing about my proposal is that you can easily
reuse the Impl to build a new container wrapper with the semantics
you like with no loss of efficiency.
As for the case of Appender... personally in the case above I'd be
tempted to use Appender.Impl directly (value semantics) and make fill
take a 'ref'. There's no point in having an extra heap allocation,
especially if you're calling test() in a loop or if there's a good
chance fill() has nothing to append to it.


I've been thinking of making Appender.Impl public or at least its own
type. A stack-based appender makes a lot of sense when you are using
it temporarily to build an array.

But array-based containers really are in a separate class from
node-based containers. It's tempting to conflate the two because they
are both 'containers', but arrays allow many more
optimizations/features that node-based containers simply can't do.


Yep, yep, I found myself wrestling with the same issues. All good
points. On one hand containers are a target for optimization
because many will use them. On the other hand you'd want to have
reasonably simple and idiomatic code in the container
implementation because you want people to understand them easily
and also to write their own. I thought for a while of a layered
approach in which you'd have both the value and the sealed
reference version of a container... it's just too much aggravation.


But are you not just pushing the aggravation elsewhere? If I need a
by value container for some reason (performance or semantics) I'll
have to write my own, and likely others will write their own too.


foo(container.dup) ; // value semantics

I'm sure some template guru can make a wrapper type for this for the
rare occasions that you need it. We can work on solving the
auto-initialization issue (where a nested container 'just works'), I
think there are ways to do it.

If that still doesn't help for your issues, then writing your own may
be the only valid option.



Using classes for containers is just marginally better than making
them by-value structs: you can use 'new' with a by-value struct if
you want it to behave as a class-like by-reference container:

struct Container {
...
}

auto c = new Container();

The only noticeable difference from a class container is that now c
is now a Container*.


And doesn't get cleaned up by the GC properly. Plus, each member call
must check if the container is 'instantiated', since we can have no
default ctors.

Yes, it's a trade-off, and I think by far class-based containers win
for the common case.


Personally, I'm really concerned by the case where you have a
container
of containers. Class semantics make things really complicated as you
always have to initialize everything in the container explicitly;
value
semantics makes things semantically easier but quite inefficient as
moving elements inside of the outermost container implies copying the
containers. Making containers auto-initialize themselves on first use
solves the case where containers are references-types; making
containers
capable of using move semantics solves the problem for value-type
containers.

Neither values nor references are perfect indeed. For example,
someone mentioned: hey, in STL I write set<vector<double>> and it
Just Works(tm). On the other hand, if you swap the two names it
still seems to work but it's awfully inefficient (something that
may trip even experienced developers).


Isn't that solved by C++0x, using move semantics in swap?



swap isn't the problem.

foreach(s; myVectorSet)
{
// if s is by value, it must be copied for each iteration in the loop
}


 -Steve

Just to note: the correct solution for the last problem is

foreach(const ref s; myVectorSet)

which works in current D. In a more value-based language you might 
even want to default to const ref for foreach loop variables, and 
even for function parameters. I suggested that a while ago, but it 
wasn't liked much for D, for good reasons.


- Krox



Re: Decision on container design

2011-02-01 Thread Simon Buerger

On 01.02.2011 20:01, Michel Fortin wrote:

On 2011-02-01 12:07:55 -0500, Andrei Alexandrescu
seewebsiteforem...@erdani.org said:


With this, the question becomes a matter of choosing the right
default: do we want values most of the time and occasional
references, or vice versa? I think most of the time you need
references, as witnessed by the many '&'s out there in code working
on STL containers.


What exactly is most of the time? In C++, you pass containers by
'&' for function parameters; using '&' elsewhere is rare.

One thing I proposed some time ago to address this problem (and to
which no one replied) was this:

ref struct Container { ... } // new ref struct concept

void func(Container c) {
// c is implicitly a ref Container
}

Container a; // by value
func(a); // implicitly passed by ref

Containers would be stored by value, but always passed by ref in
functions parameters.



That would not be by-value as most understand it.

Container globalC;
func(globalC);

void func(Container paramC)
{
    paramC.add(42); // modifies globalC with reference semantics;
                    // leaves globalC as it was with value semantics
}


Your idea would be somehow truly value-based if you defaulted not 
only to ref but to const ref, because then the function would not be 
able to alter globalC. But making parameters const by default was not 
considered the right way for D.


- Krox


Re: Decision on container design

2011-01-31 Thread Simon Buerger

On 31.01.2011 17:53, Steven Schveighoffer wrote:

http://www.dsource.org/projects/dcollections

-Steve


Well, it doesn't seem bad on a quick look. But the source was last 
updated 2 years ago, so I doubt it compiles with the current dmd. 
Anyway, the topic here is the standard behaviour of the std-lib. But 
sure, it's always nice to have custom alternatives.


Krox


Re: Decision on container design

2011-01-31 Thread Simon Buerger
Okay, my fault. Didn't realize you were the author, and that the 
project is still active. The 2 years came from here: 
http://www.dsource.org/projects/dcollections/browser/trunk/dcollections. 
I thought that trunk was the most recent version. Added a bookmark, 
and will definitely take a closer look later. Thx for mentioning it.



On 31.01.2011 19:09, Steven Schveighoffer wrote:

On Mon, 31 Jan 2011 12:48:06 -0500, Simon Buerger k...@gmx.net wrote:


On 31.01.2011 17:53, Steven Schveighoffer wrote:

http://www.dsource.org/projects/dcollections

-Steve


Well, it doesn't seem bad on a quick look. But the source was last
updated 2 years ago, so I doubt it compiles with the current dmd.
Anyway, the topic here is the standard behaviour of the std-lib. But
sure, it's always nice to have custom alternatives.


latest meaningful change was 4 months ago:
http://www.dsource.org/projects/dcollections/changeset/102

It should compile on the latest DMD (if not, file a ticket). BTW, it
changes very little mostly because I haven't had many complaints about
it (maybe not a good sign?) and I haven't had much opportunity to use
it, as my day job does not allow using D unfortunately.

It was proposed as a possibility for the std lib, but Andrei and I
couldn't come to an agreement on what the collections should look
like. Ironically, his latest decision moves std.container closer to
dcollections in design.

However, it should be relatively compatible with phobos' container lib
(in fact RedBlackTree is a direct port of dcollections' version).

-Steve




Re: Decision on container design

2011-01-29 Thread Simon Buerger

On 28.01.2011 19:31, Andrei Alexandrescu wrote:

1. Containers will be classes.

2. Most of the methods in existing containers will be final. It's up
to the container to make a method final or not.

3. Containers and their ranges decide whether they give away
references to their objects. Sealing is a great idea but it makes
everybody's life too complicated. I'll defer sealing to future
improvements in the language and/or the reflection subsystem.

4. Containers will assume that objects are cheap to copy so they won't
worry about moving primitives.


Not perfectly what I would like, but a reasonable choice, and the most 
important thing is to actually have a mature container lib. But there 
are other choices remaining: what containers will there be, and what 
will they be called? My suggestion is:


* Set, MultiSet, Map, MultiMap (hash-table based)
* OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based)
* Sequence (like the STL deque; the name is just more intuitive. Funny 
enough, the STL deque implementation has nothing to do with a doubly 
linked list)
* Array (like the STL vector. I think vector is a kind of strange 
name, but that may be only my impression)

* List (linked list)

* Stack/Queue/PriorityQueue should be built on top of another class, 
with an impl template param, like the STL ones


Things to note:
* containers should be named with respect to their use, not the 
implementation. HashSet is a bad name, because the user shouldn't 
care about the implementation.


* unordered sets are used more often than ordered ones. So it should 
be Set/OrderedSet, and not UnorderedSet/Set (also, the first one is 
two characters less typing *g*)


* opEquals should work between different types of sets (or maps, or 
sequences). Nothing wrong with comparing an ordered one to an 
unordered one, or a list to an array.


just my 2 cents,
Krox


Re: concatenation

2011-01-24 Thread Simon Buerger

On 25.01.2011 00:22, Robert Clipsham wrote:

If you append something mutable to something immutable, the resulting
type must be mutable, as some of the contents is mutable and could be
changed - if that can happen the result can't be immutable. To get
around this there's .idup I believe.



This is true in a more general sense, but not for string 
concatenation. The ~ operator always creates a copy of the array 
elements. So even with

y = x ~ "...";

y will point to different data than x, so it would be okay for it to 
be mutable.

Your statement is however true for arrays of 
Object/pointer/any reference type instead of simple chars.
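A small sketch of that copying guarantee:

```d
unittest
{
    char[] x = "abc".dup;
    auto y = x ~ 'd'; // ~ allocates new storage and copies the elements
    x[0] = 'z';
    assert(y == "abcd"); // y is unaffected: it shares no data with x
    assert(x == "zbc");
}
```

Because the result never aliases its operands, casting the result of a char-array concatenation to immutable would not break any mutable view of the same data.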


Krox


Re: repeat

2011-01-17 Thread Simon Buerger

On 18.01.2011 01:24, spir wrote:

On 01/17/2011 07:57 PM, Daniel Gibson wrote:

IMHO * (multiply) is not good because in theoretical computer science
multiply is used to concatenate two words and thus concatenating a word
with itself n times is word^n (pow(word, n) in mathematical terms).


Weird. Excuse my ignorance, but how can multiply even mean concat? How
is this written concretely (example welcome)? Do theoretical computer
science people find this syntax a Good Thing?

Denis


It is that way for formal languages: {a,b}^2 = {aa,ab,ba,bb}, so it is 
only used for sets of strings. It is logical there because * means 
something like a cartesian product, which is a concatenation of all 
combinations of both sets. The notation is not that logical for single 
strings though.


Furthermore, + should mean addition and * should mean multiplication; 
neither should stand for concatenation. A similar argument led to 
the introduction of the ~ operator in D (which is not present in C++/Java).


just my point of view,
Krox


Re: improvement request - enabling by-value-containers

2010-12-22 Thread Simon Buerger

On 21.12.2010 18:45, Bruno Medeiros wrote:

On 09/12/2010 21:55, Simon Buerger wrote:

From a pragmatic viewpoint you are right, copying containers is rare.
But on the other hand, classes imply a kind of identity, so that a set
is a different object than another object with the very same elements.


Yeah, classes have identity, but they retain the concept of equality.
So what's wrong with that? Equality comparisons would still work the
same way as by-value containers.


Identity is wrong, because if I pass the set {1,2,3} to a function, I 
would like to pass exactly these three values, not some mutable 
object. This may imply that the function parameter should be const, 
which is probably a good idea anyway. If I want it to be mutable, I 
want to use out/ref, the same way as with the simple built-in types.



That feels wrong from an aesthetical or mathematical viewpoint.


Aesthetics are very subjective (I can say the exact same thing about
the opposite case). As for a mathematical viewpoint, yes, it's not
exactly the same, but first of all, it's not generally a good idea to
strictly emulate mathematical semantics in programming languages. So
to speak, mathematical objects are immutable, and they exist in a
magical infinite space world without the notion of execution or
side-effects. Trying to model those semantics in a programming
language brings forth a host of issues (most of them
performance-related). But more important, even if you wanted to do
that (to have it right from a mathematical viewpoint), mutable
by-value containers are just as bad, you should use immutable data
instead.


You might be right that modeling mathematics is not perfect, at least 
in C/C++/D/Java, though functional programming is fine with it, and 
it uses immutable data just as you suggested. But I'm aware that 
that's not the way to go for D.


Anyway, though total math-like behavior is impossible, with
auto A = Set(1,2,3);
auto B = A;
B.add(42);
letting A and B have different contents is much closer to math than 
letting both be equal. Though neither is perfect.


And as for immutable data: it's not perfectly possible, but in many 
circumstances it is considered good style to use const and 
assumeUnique as much as possible. It helps optimizing, 
multi-threading and code correctness. So it is a topic not only in 
functional programming but also in D.



Furthermore, if you have for example a vector of vectors,

vector!int row = [1,2,3];
auto vec = Vector!(Vector!int)(5, row);

then vec should be 5 rows, and not 5 times the same row.



Then instead of Vector use a static-length vector type, don't use a
container.


Maybe you want to change that stuff later on, so static length is no 
option. The following example might demonstrate the problem more 
clearly. It is intended to init a couple of sets to empty.


set!int[42] a;

version(by_reference_wrong):
a[] = set!int.empty;// this does not work as intended

version(by_reference_correct):
foreach(ref x; a)
x = set!int.empty;

version(by_value):
//nothing to be done, already everything empty

Obviously the by_value version is the cleanest. Furthermore, the first 
example demonstrates that by-reference does not work together with the 
slice syntax (which is equivalent to the constructor call in my 
original example). Replacing set!int.empty with new set!int doesn't 
change the situation, but only makes it sound more weird in my ears: 
new vector? What was wrong with the old one? And I don't want _an_ 
empty set, I want _the_ empty set. Every empty set is equal, so 
there is only one.


Last but not least let me state: I do _not_ think that 
value containers will go into phobos/tango some day; that would be 
too difficult in practice. I just want to state that there are certain 
reasons for them. (And originally this thread asked for some small 
changes in the language to make it possible, not for the standard.)


Krox

ps: I'll go on vacation now, see you next year if there is still need 
for discussion. Merry Christmas all :)




Re: Paralysis of analysis

2010-12-14 Thread Simon Buerger

On 14.12.2010 20:02, Andrei Alexandrescu wrote:

I kept on literally losing sleep about a number of issues involving
containers, sealing, arbitrary-cost copying vs. reference counting and
copy-on-write, and related issues. This stops me from making rapid
progress on defining D containers and other artifacts in the standard
library.

Clearly we need to break this paralysis, and just as clearly whatever
decision taken now will influence the prevalent D style going forward.
So a decision needs to be made soon, just not hastily. Easier said
than done!

I continue to believe that containers should have reference semantics,
just like classes. Copying a container wholesale is not something you
want to be automatic.

I also continue to believe that controlled lifetime (i.e.
reference-counted implementation) is important for a container.
Containers tend to be large compared to other objects, so exercising
strict control over their allocated storage makes a lot of sense. What
has recently shifted in my beliefs is that we should attempt to
implement controlled lifetime _outside_ the container definition, by
using introspection. (Currently some containers use reference counting
internally, which makes their implementation more complicated than it
could be.)

Finally, I continue to believe that sealing is worthwhile. In brief, a
sealing container never gives out addresses of its elements so it has
great freedom in controlling the data layout (e.g. pack 8 bools in one
ubyte) and in controlling the lifetime of its own storage. Currently
I'm not sure whether that decision should be taken by the container,
by the user of the container, or by an introspection-based wrapper
around an unsealed container.

* * *

That all being said, I'd like to make a motion that should simplify
everyone's life - if only for a bit. I'm thinking of making all
containers classes (either final classes or at a minimum classes with
only final methods). Currently containers are implemented as structs
that are engineered to have reference semantics. Some collections use
reference counting to keep track of the memory used.

Advantages of the change:

- Clear, self-documented reference semantics

- Uses the right tool (classes) for the job (define a type with
reference semantics)

- Pushes deterministic lifetime issues outside the containers
(simplifying them) and factors such issues into reusable wrappers a la
RefCounted.

Disadvantages:

- Containers must be dynamically allocated to do anything - even
calling empty requires allocation.

- There's a two-words overhead associated with any class object.

- Containers cannot do certain optimizations that depend on
container's control over its own storage.


What say you?

Andrei


I continue to believe that containers should be value types. In order 
to prevent useless copying you can use something like an Impl* impl 
and reference counting; then you only do a copy on an actual change. 
This is the way I'm currently implementing my own container classes.


But I see the point in making them reference types, because copying is 
so rare in the real world. Though I find the expression new Set() most 
strange, you are definitely right in the following: if you make them 
reference types, they should be classes, not structs (and final, to 
prevent strange overloading).


Krox


Re: Paralysis of analysis

2010-12-14 Thread Simon Buerger

On 14.12.2010 20:53, Andrei Alexandrescu wrote:

Coming from an STL background I was also very comfortable with the
notion of value. Walter pointed to me that in the STL what you worry
about most of the time is to _undo_ the propensity of objects getting
copied at the drop of a hat. For example, think of the common n00b
error of passing containers by value.


True, C++/STL does much work to prevent the copy mechanism, but 
it can be circumvented by using the indirection+refcount trick. Then 
it doesn't matter how you pass it; it gets copied lazily when the 
first actual change occurs. That adds some overhead:

1) increasing/decreasing the refcount on every argument passing
2) checking for refCount > 1 on every modifying method call (not on 
the reading methods)


I'm pretty sure (1) is insignificant. About (2) I'm not sure. For a 
very simple list container it might be a problem, but for 
sophisticated structures like hashtables or trees this one check is 
probably insignificant.
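A minimal sketch of the indirection+refcount scheme described above (all names invented; destructor bookkeeping omitted for brevity):

```d
struct CowArray
{
    private static struct Impl { int refs; int[] data; }
    private Impl* impl;

    this(this) // postblit: copying the container just bumps the refcount (1)
    {
        if (impl !is null) impl.refs++;
    }

    private void mutate() // called before every modifying operation (2)
    {
        if (impl is null)
            impl = new Impl(1, null);
        else if (impl.refs > 1) // shared: do the real copy lazily, now
        {
            impl.refs--;
            impl = new Impl(1, impl.data.dup);
        }
    }

    void append(int x) { mutate(); impl.data ~= x; }

    // reading methods need no refcount check
    @property size_t length() const { return impl is null ? 0 : impl.data.length; }
}

unittest
{
    CowArray a;
    a.append(1);
    auto b = a;  // cheap: shares storage, refs == 2
    b.append(2); // the copy happens here; a is untouched
    assert(a.length == 1 && b.length == 2);
}
```

The two overheads from the post are visible directly: (1) is the postblit increment, (2) is the refs > 1 check at the top of every mutator.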




So since we have the opportunity to decide now for eternity the right
thing, I think reference semantics works great with containers.


Indeed. Whichever way we go, we need a good reason. I hope a similar 
discussion will take place for the actual interface of the 
container lib. (Which template params should there be? T, Allocator 
and Comp are the three most classic ones, but more or fewer are 
possible. And what kinds of containers should there be at all? 
Anyway, that doesn't belong here now.)


Krox


Re: improvement request - enabling by-value-containers

2010-12-09 Thread Simon Buerger

On 08.12.2010 23:45, Jonathan M Davis wrote:

On Wednesday, December 08, 2010 14:14:57 Simon Buerger wrote:

For Every lib its a design descision if containers should be value- or
reference-types. In C++ STL they are value-types (i.e. the
copy-constructor does a real copy), while in tango and phobos the
descision was to go for reference-types afaik, but I would like to be
able to write value-types too, which isn't possible (in a really good
way) currently. Following points would need some love (by-value
containers are probably not the only area, where these could be useful)


It's extremely rare in my experience that it makes any sense to copy a container
on a regular basis. Having an easy means of creating a deep copy of a container
or copying the elements from one container to another efficiently would be good,
but having containers be value types is almost always a bad idea. It's just not
a typical need to copy containers - certainly not enough to have them be
copied just because you passed them to a function or returned them from one. I
think that reference types for containers is very much the correct decision.
There should be good ways to copy containers, but copying shouldn't be the
default for much of anything in the way of containers.


From a pragmatic viewpoint you are right, copying containers is rare. 
But on the other hand, classes imply a kind of identity, so that a set 
is a different object than another object with the very same 
elements. That feels wrong from an aesthetical or mathematical 
viewpoint. Furthermore, if you have for example a vector of vectors,


vector!int row = [1,2,3];
auto vec = Vector!(Vector!int)(5, row);

then vec should be 5 rows, and not 5 times the same row.


(1) Allow default-constructors for structs
I don't see a reason, why this(int foo) is allowed, but this() is
not. There might be some useful non-trivial init to do for complex
structs.


It has to do with the init property. It has to be known at compile time for all
types. For classes that's easy, because it's null, but for structs it's what
all of their member variables are directly initialized to. If you added a default
constructor, then init would have to be whatever that constructor produced,
which would shift it from compile time to run time. It should be possible to have
default constructors which are definitely limited in a number of ways (like
having to be nothrow and possibly pure), but that hasn't been sorted out, and
even if it is, plenty of cases where people want default constructors still
wouldn't likely work. It just doesn't work to have default constructors which
can run completely arbitrary code: you could get exceptions thrown in weird
places and a variety of other problems which we can't have in situations where
init is used. Hopefully we'll get limited default constructors at some point,
but it hasn't happened yet (and probably won't without a good proposal that
deals with all of the potential issues), and regardless, it will never be as
flexible as what C++ does. It's primarily a side effect of insisting that all
variables be default-initialized if they're not directly initialized.
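A common D2 workaround, for what it's worth, is a static opCall that plays the role of a default constructor but must be invoked explicitly; a sketch, assuming a float-based struct like the Matrix below (my own example):

```d
import std.math : isNaN;

struct Matrix
{
    float[9] data;

    // Workaround for the missing this(): a static opCall that acts as an
    // explicit "default constructor". Matrix() calls it; Matrix m; does not.
    static Matrix opCall()
    {
        Matrix m;
        m.data[] = 0;
        m.data[0] = m.data[4] = m.data[8] = 1; // identity matrix
        return m;
    }
}

void main()
{
    auto m = Matrix();        // runs opCall
    assert(m.data[0] == 1 && m.data[1] == 0);

    Matrix n;                 // plain declaration still gives Matrix.init
    assert(isNaN(n.data[0])); // floats default-initialize to NaN in D
}
```

This keeps init a compile-time constant while still allowing non-trivial setup, at the cost of the caller having to write Matrix() explicitly.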


I partially see your point: the constructor would be called in places
the programmer didn't expect. But actually, what's the problem with an
exception? They can always happen anyway (at least OutOfMemory).



(2) const parameters by reference
If a parameter to a function is read-only, the right notion depends on
the type of that parameter: in for simple stuff like ints, and
ref const for big structures. Using in for big data implies a
whole copy, even though it's constant, and using ref const for
simple types is a useless indirection. This is a problem for generic
code, when the type is templated, because there is no way to switch
between in and ref const with compile-time reflection.

Solution one: make ref a real type-constructor, so you could do the
following (this is possible in C++):

static if(is(T == struct))
alias ref const T const_type;
else
alias const scope T const_type;
// const scope is (currently) equivalent to in
void foo(const_type x)

Solution two: let in decide whether to pass by reference or by value,
depending on the type. Probably the better solution, because the
programmer doesn't need to make the decision himself anymore.
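Until in or auto ref handles this automatically, constrained template overloads can simulate the switch in generic code; firstOf and Big are hypothetical names for illustration:

```d
struct Big { int[1000] payload; }

// Pass non-struct types by value (in), structs by ref const:
int firstOf(T)(in T x) if (!is(T == struct))
{
    return cast(int) x;
}

int firstOf(T)(ref const T x) if (is(T == struct))
{
    return x.payload[0]; // no 4000-byte copy is made here
}

void main()
{
    Big b;
    b.payload[0] = 42;
    assert(firstOf(b) == 42); // struct overload, by reference
    assert(firstOf(7) == 7);  // scalar overload, by value
}
```

The caller sees one name; the constraint does the dispatch that the quoted static if / alias would otherwise have to do.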


I think that auto ref is supposed to deal with some of this, but it's buggy at
the moment, and I'm not sure exactly what it's supposed to do. There was some
discussion on this one in a recent thread.


Letting in decide would be cleaner IMO, but anyway it's good to hear
that the problem is recognized. I'll look for the other thread.



(3) make foreach parameters constant
When you do foreach(x; a), the x value gets copied in each iteration;
once again, that matters for big types, especially when you have a
copy constructor. The current workaround is prepending ref: nothing
gets copied, but the compiler won't know it is meant to be read-only.
Solution: either allow ref const or in in foreach. Or you could
even make x default to constant if not stated as ref explicitly.

Re: improvement request - enabling by-value-containers

2010-12-09 Thread Simon Buerger

On 09.12.2010 23:39, Jesse Phillips wrote:

Simon Buerger Wrote:


Vector!int row = [1,2,3];
auto vec = Vector!(Vector!int)(5, row);

then vec should be 5 rows, and not 5 times the same row.


Why? You put row in there and said there were 5 of them.

vec[] = row.dup;
I believe that would be the correct syntax if you wanted to store 5 different
vectors of the same content (works for arrays).


No, that line would duplicate row once, and store that same copy in 
every element of vec.
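That behavior is easy to check with built-in slices (my own test, assuming ordinary D2 array semantics):

```d
void main()
{
    int[] row = [1, 2, 3];
    int[][] vec = new int[][](5);

    vec[] = row.dup; // .dup runs once; all five slots share that one copy

    vec[0][0] = 99;
    assert(vec[4][0] == 99); // the five rows still alias each other
    assert(row[0] == 1);     // only the original row is unaffected
}
```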



I partially see your point, the constructor would be called in places
the programmer didnt expect, but actually, what's the problem with an
exception? They can always happen anyway (at least outOfMemory)


I think there is even more to it. init is used during compile time so
properties of the class/struct can be checked. I don't think exceptions are
supported in CTFE.


letting in decide would be cleaner IMO, but anyway good to hear that
problem is recognized. Will look for the other thread.


I'm not sure if the spec says in must be passed by reference, only that that is
how it is done. I'd think it'd be up to the compiler.


Other way around: in is currently passed by value, though the spec
does not explicitly disallow passing by reference, so it might be
implemented without even changing the spec.



You are right that default-const would be contrary to the rest of the
language, but when I think longer about this... the same default-const
should apply to all function parameters. They should be input, output
or inout. The mutable copy of the original, which is common in
C/C++/D and everything alike, is actually pretty weird (modifying
non-output parameters inside a function is considered bad style even
in C++ and Java). But well, that would really be a step too big for
D2... maybe I'll suggest it for D3 some day *g*


I believe Bearophile has beaten you to that. I think it is even in Bugzilla. I
think it would only make sense to add it to D3 if it becomes common to mark
function parameters as in. But I agree it is easier to think "I want to modify
this" than it is to say "I'm not modifying this, so it should be in". Though
currently I don't think there is a way to mark the current default behavior.


Well, it would be good style to add in/out to each and every parameter
there is, but I don't do it myself either (except in
container implementations, where good style and the last bit of
optimization seem important).


Krox


improvement request - enabling by-value-containers

2010-12-08 Thread Simon Buerger
For every lib it's a design decision whether containers should be value or
reference types. In the C++ STL they are value types (i.e. the
copy constructor does a real copy), while in Tango and Phobos the
decision was to go for reference types afaik, but I would like to be
able to write value types too, which isn't possible (in a really good
way) currently. The following points would need some love (by-value
containers are probably not the only area where these could be useful).


(1) Allow default-constructors for structs
I don't see a reason why this(int foo) is allowed, but this() is
not. There might be some useful non-trivial initialization to do for
complex structs.


(2) const parameters by reference
If a parameter to a function is read-only, the right notion depends on
the type of that parameter: in for simple stuff like ints, and
ref const for big structures. Using in for big data implies a
whole copy, even though it's constant, and using ref const for
simple types is a useless indirection. This is a problem for generic
code, when the type is templated, because there is no way to switch
between in and ref const with compile-time reflection.


Solution one: make ref a real type-constructor, so you could do the 
following (this is possible in C++):


static if(is(T == struct))
alias ref const T const_type;
else
alias const scope T const_type;
// const scope is (currently) equivalent to in
void foo(const_type x)

Solution two: let in decide whether to pass by reference or by value,
depending on the type. Probably the better solution, because the
programmer doesn't need to make the decision himself anymore.


(3) make foreach parameters constant
When you do foreach(x; a), the x value gets copied in each iteration;
once again, that matters for big types, especially when you have a
copy constructor. The current workaround is prepending ref: nothing
gets copied, but the compiler won't know it is meant to be read-only.
Solution: either allow ref const or in in foreach. Or you could
even make x default to constant if not stated as ref explicitly. The
last alternative seems logical to me, but it may break existing code.
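The per-iteration copies are observable with a postblit counter; a small sketch of my own (Tracked is a made-up name):

```d
int copies; // module-level counter, bumped by the postblit below

struct Tracked
{
    int value;
    this(this) { ++copies; } // postblit runs on every copy of a Tracked
}

void main()
{
    auto arr = [Tracked(1), Tracked(2), Tracked(3)];

    copies = 0;
    foreach (x; arr) {}      // each iteration copies one element
    assert(copies == 3);

    copies = 0;
    foreach (ref x; arr) {}  // ref: no copies, but x is also mutable
    assert(copies == 0);
}
```

This is exactly the trade-off described above: ref avoids the copies but loses the read-only guarantee.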


Comments welcome,
Krox