Re: dmd 1.050 and 2.035 release

2009-10-26 Thread Pablo Ripolles
Walter Bright Wrote:

 The main purpose of this is to correct a couple of regressions that were 
 blocking QtD and Tango.
 
 http://www.digitalmars.com/d/1.0/changelog.html
 http://ftp.digitalmars.com/dmd.1.050.zip
 
 
 http://www.digitalmars.com/d/2.0/changelog.html
 http://ftp.digitalmars.com/dmd.2.035.zip
 
 Many thanks to the numerous people who contributed to this update.

Can we expect this to work on Mac OS X version 10.5 still?

Thanks!



Re: dmd 1.050 and 2.035 release

2009-10-26 Thread Walter Bright

Pablo Ripolles wrote:

Can we expect this to work on Mac OS X version 10.5 still?


Yes.


Re: Disallow catch without parameter (LastCatch)

2009-10-26 Thread grauzone

Christopher Wright wrote:
PS: I wonder, should the runtime really execute finally blocks if an 
Error exception is thrown? (Errors are for runtime errors, Exception 
for normal exceptions.) Isn't it dangerous to execute arbitrary user 
code in presence of what is basically an internal error?


Are all Errors unrecoverable except by immediately aborting the 
application?


What about logging?

What about putting up a reasonable error message for the user?

What about restarting the failed module in case the issue was temporary 
and environmental?


Something is wrong with your program internally if something like this 
happens. You can't expect a consistent program state. And most of the 
code in finally blocks was not written for such situations. You'll 
probably end up throwing another runtime error from within a finally block.


If you really want reliability, you should terminate the process 
immediately, and check the situation from another process.
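
A minimal sketch of the "log, then get out" pattern being debated here (runApplication is a hypothetical entry point); whether even this much user code should run after an Error is exactly the point of contention:

import core.stdc.stdlib : abort;
import std.stdio : stderr;

void runApplication() { assert(false, "internal invariant violated"); }

void main()
{
    try
    {
        runApplication();
    }
    catch (Error e)   // assertion failures, range errors, ...
    {
        // Keep the handler minimal: the program state is suspect.
        stderr.writeln("fatal: ", e.msg);
        abort();      // terminate immediately rather than continue
    }
}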


Re: Disallow catch without parameter (LastCatch)

2009-10-26 Thread grauzone

Leandro Lucarella wrote:

grauzone, el 25 de octubre a las 12:09 me escribiste:

Right now, you can catch every exception with try { something; }
catch { somethingelse; }.

Can we get rid of this abomination before D2 is finalized? I claim
that it's completely useless, and even more so, every single use of
this is a bug.


Why every use of this is a bug?


Because you most likely catch more exception types than you really want. 
For example, a novice programmer is likely to write something like this:


int x;
try {
 x = convert(somestring);
} catch {
 //convert throws some exception if convert() fails
 return -1;
}

This is a bug, because catch might catch and *hide* a runtime error like 
failing assertions. The programmer really wanted to write catch 
(ConversionException e). Even if he wrote catch (Exception e), he 
wouldn't catch runtime errors, and the code would be safe.
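
A self-contained version of the fix grauzone describes, using Phobos' std.conv, whose to!int throws a ConvException on bad input (the wrapper function itself is only illustrative):

import std.conv : to, ConvException;

int parseOrMinusOne(string s)
{
    try {
        return to!int(s);
    } catch (ConvException e) {
        // Only conversion failures are handled; failed assertions and
        // other Errors still propagate, as they should.
        return -1;
    }
}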


Re: No header files?

2009-10-26 Thread Walter Bright

Lionello Lunesu wrote:
There's no need for a new format: D's name mangling has all information 
necessary to reconstruct a function's full signature.


There's a lot more needed than just function signatures and vtbl 
layouts. Templates, enums, consts, struct/class fields, interfaces, etc.


Perhaps it's my Pascal background, but I miss how I did not have to 
worry about include paths and all that then, and I like how I don't have 
to worry about it when using C# now.


Somehow, you're going to have to tell the build system which files to use.


The bizarre world of typeof()

2009-10-26 Thread Don
I'm trying to make sense of the rules for 'typeof'. It's difficult 
because DMD's behaviour is so different to the spec. Here's four simple 
cases.


// This doesn't compile on D1.
//alias typeof(int*int) Alias1;

// This compiles in D1, but not in D2.
alias int Int;
alias typeof(Int*Int) Alias2;

// Yet this DOES compile on D2 !
typeof(T*U) foo(T, U)(T x, U y) { return x*y; }
alias typeof(foo(Int, Int)) Alias3;

// And this fails on both D1 and D2, with a dreadful error message.
//alias typeof(foo(int)) Alias4;

I can't see anything in the spec to say why ANY of these examples should 
compile. Yet, the existing template constraints feature relies on the 
Alias3 case.


I can see two ways forward:
(1) enforce the existing spec. Make all uses of types as expressions 
into a bug. This will break a lot of existing code, including several in 
the DMD test suite!
You'd generally need to include a .init whenever using a type inside a 
typeof(). This would make some code a lot uglier.
I'm also not sure what happens with alias parameters. (If A is an alias 
to a type, then typeof(A*A) should be changed to typeof(A.init*A.init); 
but if it's an alias to a variable, it should remain as typeof(A*A)).


(2) Define that, inside a typeof() expression, any type T is translated 
into T.init. The syntax for typeof() would need to be changed, in order 
to allow the case 'alias1'.


Note, however, that in both cases there's no such thing as .init for 
tuples; it might need to be added.


Behaviour (2) is probably more convenient, behaviour (1) is easier to 
justify. But I think the existing behaviour of typeof() doesn't make 
much sense.
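
For illustration, the .init rewrite that option (1) would require looks like this; the operands become values, so typeof sees a genuine expression:

alias int Int;

// Types as operands are what the spec disallows:
// alias typeof(Int * Int) Alias2;

// Using .init turns the operands into expressions, so this is fine:
alias typeof(Int.init * Int.init) Alias2a;
static assert(is(Alias2a == int));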


Re: Restricting ++ and --

2009-10-26 Thread Kagamin
bearophile Wrote:

 Removing those operators from D, as Python, may look excessive. So a possible 
 compromise can be:
 - Deprecate the pre versions:  --x  and ++x
 - Make them return void, so they can't be used as expressions like this:
 y = x++;
 foo(x--);
 You have to use them as:
 x++; y = x;
 x--; foo(x);

int PreInc(ref int i){ i++; return i; }
int PostInc(ref int i){ i++; return i-1; }
y=PreInc(i);
y=PostInc(i);

just a little more difficult.

y=x=0;
Ever wondered what opAssign returns?


Re: Restricting ++ and --

2009-10-26 Thread Kagamin
bearophile Wrote:

 Python designers have totally avoided to put those operators in the language, 
 with the rationale they are bug-prone while saving just a little of typing.

In D you don't actually have to use them: you have ranges, foreaches etc.


Re: [OT] What should be in a programming language?

2009-10-26 Thread bearophile
Yigal Chripun:

 D has many (sometimes very powerful) hacks that could and should be 
 replaced with a much better general solution.

Sometimes fully orthogonal designs aren't the best, because they force the 
programmer to do too much assembly work. A good language has to offer 
some handy pre-built/high-level constructs for the most common operations, which 
can save a lot of work. Sometimes general solutions produce slow programs, or 
they require a more complex/smarter/slower compiler (I need a new word there, 
complexer? Natural languages too need updates now and then).
But I agree that in a few parts D could enjoy a redesign. Andrei is doing some work 
on this, and quite a bit more can be done. The alias this feature looks like a little 
hack to me too. A safer or more theoretically sound solution may be found.

Bye,
bearophile


Re: Restricting ++ and --

2009-10-26 Thread bearophile
Kagamin:

 int PreInc(ref int i){ i++; return i; }
 int PostInc(ref int i){ i++; return i-1; }
 y=PreInc(i);
 y=PostInc(i);
 
 just a little more difficult.

That's a vote for my proposal then, because you have shown a way to do the same 
thing for people that really need to do it.

In D there are often several ways to do anything. To help avoid bugs there's no need 
to totally remove all ways to do something, only to turn them into something 
explicit and under the programmer's control. The idea is to turn places where 
possible bugs can hide into something that can be seen.

Bye,
bearophile


Re: Restricting ++ and --

2009-10-26 Thread bearophile
Kagamin:
 In D you don't actually have to use them: you have ranges, foreaches etc.

Very good, then restricting them will cause no problems.

Bye,
bearophile


Re: Private enum members + Descent

2009-10-26 Thread Lars T. Kyllingstad

Yigal Chripun wrote:
personally I'd like to see D enums replaced by Java style enums which 
make more sense to me. D enums are even worse than C enums since you can 
write:

enum foo = "text";

which to me looks very similar to:
auto cat = new Dog;



I agree that enum is a horrible keyword to use for declaring manifest 
constants. In my opinion the D developers are sometimes a bit too afraid 
of introducing new keywords, and this is one of the consequences.


Personally, I think this would be a better scheme:

const: manifest constants, no storage (like const in D1, enum in D2)
 readonly: used for a read-only view of mutable data (like const in D2)
immutable: truly immutable data (like now)

-Lars


Re: The bizarre world of typeof()

2009-10-26 Thread Ary Borenszweig

Don wrote:
I'm trying to make sense of the rules for 'typeof'. It's difficult 
because DMD's behaviour is so different to the spec. Here's four simple 
cases.


// This doesn't compile on D1.
//alias typeof(int*int) Alias1;


Not valid: typeof accepts an expression and int*int is not a valid 
expression.




// This compiles in D1, but not in D2.
alias int Int;
alias typeof(Int*Int) Alias2;


Almost same as above: Int resolves to a type and type*type is not a 
valid expression.




// Yet this DOES compile on D2 !
typeof(T*U) foo(T, U)(T x, U y) { return x*y; }
alias typeof(foo(Int, Int)) Alias3;


Of course, because this doesn't translate to Int*Int, this translates 
to some variables x and y of type Int and Int respectively for which 
you can do x*y.


Re: The bizarre world of typeof()

2009-10-26 Thread Lars T. Kyllingstad

Don wrote:
I'm trying to make sense of the rules for 'typeof'. It's difficult 
because DMD's behaviour is so different to the spec. Here's four simple 
cases.


// This doesn't compile on D1.
//alias typeof(int*int) Alias1;

// This compiles in D1, but not in D2.
alias int Int;
alias typeof(Int*Int) Alias2;

// Yet this DOES compile on D2 !
typeof(T*U) foo(T, U)(T x, U y) { return x*y; }
alias typeof(foo(Int, Int)) Alias3;

// And this fails on both D1 and D2, with a dreadful error message.
//alias typeof(foo(int)) Alias4;

I can't see anything in the spec to say why ANY of these examples should 
compile. Yet, the existing template constraints features relies on the 
Alias3 case.



Here are a few more:

class Foo { real bar() { return 1.0; } }
Foo foo = new Foo;

// Passes, but should fail.
static assert (is (typeof(foo.bar) == function));

// Passes, as expected.
static assert (is (typeof(&foo.bar) == delegate));

// Passes, but should fail. This is similar to Don's examples.
static assert (is (typeof(Foo.bar) == function));

// This one fails with the following hilarious message:
// Error: static assert  (is(real function() == function)) is false
static assert (is (typeof(&Foo.bar) == function));

I have no idea why typeof(Foo.bar) even works, but it does. Foo is 
completely meaningless.




I can see two ways forward:
(1) enforce the existing spec. Make all uses of types as expressions 
into a bug. This will break a lot of existing code, including several in 
the DMD test suite!
You'd generally need to include a .init whenever using a type inside a 
typeof(). This would make some code a lot uglier.
I'm also not sure what happens with alias parameters. (If A is an alias 
to a type, then typeof(A*A) should be changed to typeof(A.init*A.init); 
but if it's an alias to a variable, it should remain as typeof(A*A)).


(2) Define that, inside a typeof() expression, any type T is translated 
into T.init. The syntax for typeof() would need to be changed, in order 
to allow the case 'alias1'.


Note, however, that in both cases there's no such thing as .init for 
tuples; it might need to be added.


Behaviour (2) is probably more convenient, behaviour (1) is easier to 
justify. But I think the existing behaviour of typeof() doesn't make 
much sense.



I vote for (1). There should be as few special cases in the language 
as possible.


-Lars


Re: The bizarre world of typeof()

2009-10-26 Thread Don

Ary Borenszweig wrote:

Don wrote:
I'm trying to make sense of the rules for 'typeof'. It's difficult 
because DMD's behaviour is so different to the spec. Here's four 
simple cases.


// This doesn't compile on D1.
//alias typeof(int*int) Alias1;


Not valid: typeof accepts an expression and int*int is not a valid 
expression.


Agreed.




// This compiles in D1, but not in D2.
alias int Int;
alias typeof(Int*Int) Alias2;


Almost same as above: Int resolves to a type and type*type is not a 
valid expression.


Agreed.



// Yet this DOES compile on D2 !
typeof(T*U) foo(T, U)(T x, U y) { return x*y; }
alias typeof(foo(Int, Int)) Alias3;


Of course, because this doesn't translate to Int*Int, this translates 
to some variables x and y of type Int and Int respectively for which 
you can do x*y.

How does it get from Int to an instance of type Int?
The first issue is typeof( T * U ). T and U are not variables, they are 
types.






Re: Restricting ++ and --

2009-10-26 Thread d-noob
bearophile Wrote:

 Kagamin:
  In D you don't actually have to use them: you have ranges, foreaches etc.
 
 Very good, then restricting them will cause no problems.

He probably meant that the idea was to leave untouched those parts of the language 
that don't really matter to you. You can write Pythonese D just like 
before, and the rest of us can write in good old C/C++ fashion.



Re: Private enum members

2009-10-26 Thread Kagamin
Justin Johansson Wrote:

 enum Color {
   private UNINITIALIZED = -1,
   RED, GREEN, BLUE
 }

It's a syntactic ambiguity, I think. There's not much difference between
private UNINITIALIZED = -1,
RED, GREEN, BLUE

and

private UNINITIALIZED = -1, RED, GREEN, BLUE


Re: Locally Instantiated Templates

2009-10-26 Thread Kagamin
Ellery Newcomer Wrote:

 Does local instantiation just mean the template instance is located in
 the stack (heap?) frame, and otherwise is conceptually the same as any
 other template instance? e.g. same scoping rules?

Templates have no storage. A template is just a semantic construct for 
parameterizing types. Template members (variables, functions, types) may have 
storage.
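
A small illustration of that distinction (Counter is a made-up name): the template itself has no storage, but each instantiation gets its own copy of the member variable:

template Counter(T)
{
    int count;   // storage exists per instantiation, not in the template
}

void main()
{
    Counter!int.count  = 1;
    Counter!long.count = 2;
    assert(Counter!int.count == 1);   // independent of Counter!long
}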


Re: Private enum members

2009-10-26 Thread d-noob
Kagamin Wrote:

 Justin Johansson Wrote:
 
  enum Color {
private UNINITIALIZED = -1,
RED, GREEN, BLUE
  }
 
 It's a syntactic ambiguity, I think. There's not much difference between
 private UNINITIALIZED = -1,
 RED, GREEN, BLUE
 
 and
 
 private UNINITIALIZED = -1, RED, GREEN, BLUE

So let's fix it:

enum C {
  private:
  UNINITIALIZED = -1;
  public:
  RED, GREEN, BLUE;
}

But even now it's easy to set illegal values in client code:

C c = RED;
c--; // bang!

Gotta love the implicit conversions - this is almost like pointer arithmetic
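
For what it's worth, the kind of safety Java-style enums give can be approximated with a struct wrapper; this is only an illustration of why the c-- problem goes away, not a proposal from this thread:

struct Color
{
    private int value = -1;                  // the UNINITIALIZED state
    private this(int v) { value = v; }

    static immutable Color RED   = Color(0);
    static immutable Color GREEN = Color(1);
    static immutable Color BLUE  = Color(2);
}

void main()
{
    Color c = Color.RED;
    // c--;      // does not compile: no -- operator is defined
    // c = 5;    // does not compile: no implicit conversion from int
}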


Re: [OT] What should be in a programming language?

2009-10-26 Thread Kagamin
Yigal Chripun Wrote:

 for instance there's special handling of void return types so it would 
 be easier to work with in generic code. instead of this compiler hack a 
 much simpler solution is to have a unit type and ditch C style void. the 
 bottom type should also exist mainly for completeness and for a few 
 stdlib functions like abort() and exit()

uint and void return types may be nearly equivalent for x86 architecture, CLI 
makes strong difference between them.


Re: Private enum members

2009-10-26 Thread Kagamin
d-noob Wrote:

 So let's fix it:
 
 enum C {
   private:
   UNINITIALIZED = -1;
   public:
   RED, GREEN, BLUE;
 }

See enums with virtual methods
http://java.sun.com/j2se/1.5.0/docs/guide/language/enums.html


Re: Restricting ++ and --

2009-10-26 Thread Kagamin
bearophile Wrote:

 Kagamin:
  In D you don't actually have to use them: you have ranges, foreaches etc.
 
 Very good, then restricting them will cause no problems.

Why do you use a feature if you feel uncomfortable with it? If I'm not familiar 
with template voodoo, should I write a TMP library and debug it?


Re: The bizarre world of typeof()

2009-10-26 Thread Kagamin
Lars T. Kyllingstad Wrote:

  // This one fails with the following hilarious message:
  // Error: static assert  (is(real function() == function)) is false
  static assert (is (typeof(&Foo.bar) == function));

Failure is valid, compiler just can't show member function types correctly.


GC Precision

2009-10-26 Thread dsimcha
I just realized last night that D's templates are probably powerful enough now
to generate bit masks that can be used for precise GC heap scanning.  I'm
halfway (emphasis on halfway) thinking of using this to try to hack the GC and
make heap scanning fully precise except for the corner case of unions.
However, this ties into several things that others in the D community are
doing, so I want to gauge people's responses and make sure I'm not wasting
effort on something that will be useless in 6 months.

1.  Sean, Leonardo, whoever else may be working on GC implementations, have
you by any chance broken ground on precise heap scanning already?

2.  Andrei, Walter, how close are we to actually eliminating new from the
language?  If all allocations were done by either calling GC.malloc() or using
templates that call GC.malloc(), then things would get a lot simpler than if I
were to have to hack the compiler to make new pass type info to the GC.

3.  I'm thinking bit masks could be stored as follows:

When getBitMask!(T) is instantiated, it generates an immutable size_t[N].
Element 0 is the size of the array (to allow for storing only the ptr in the
GC), element 1 is the size of one instance of the object, in bytes.  The size
of the memory block must be a multiple of this.  Elements 2..$ are all of the
offsets that should be scanned for pointers.  For example:

struct Foo {
uint bar;
void* baz;
}

getBitMask!(Foo);  // [3, 8, 4].

That leaves the problem of where/how to store the pointers to this information
in the GC efficiently.  I haven't gotten that far yet, but I remember some
concerns have been raised in the past about storing 4 bytes per GC object for
a pointer to the bitmask.  For my use cases, where I tend to allocate a
relatively small number of relatively large objects, this isn't a problem.
However, in a heavily OO environment, where people allocate tons of tiny
objects, it might be.
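
For concreteness, a hedged sketch (present-day D, made-up names, unions ignored) of how a template could collect the offsets and pack them in the [length, element size, offsets...] layout described above:

import std.traits : isPointer;

template getBitMask(T) if (is(T == struct))
{
    // Recursively collect byte offsets of pointer-like fields.
    private template offsetsImpl(size_t i)
    {
        static if (i == T.tupleof.length)
            enum size_t[] offsetsImpl = [];
        else static if (isPointer!(typeof(T.tupleof[i])) ||
                        is(typeof(T.tupleof[i]) == class))
            enum size_t[] offsetsImpl =
                [T.tupleof[i].offsetof] ~ offsetsImpl!(i + 1);
        else
            enum size_t[] offsetsImpl = offsetsImpl!(i + 1);
    }

    private enum size_t[] offsets = offsetsImpl!0;

    // [total length, element size, pointer offsets...]
    enum size_t[] getBitMask = [offsets.length + 2, T.sizeof] ~ offsets;
}

struct Foo
{
    uint  bar;
    void* baz;
}

// On a 32-bit target this is [3, 8, 4]; on 64-bit it is [3, 16, 8].
pragma(msg, getBitMask!Foo);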


Thread-local storage and Performance

2009-10-26 Thread dsimcha
Has D's builtin TLS been optimized in the past 6 months to year?  I had
benchmarked it awhile back when optimizing some code that I wrote and
discovered it was significantly slower than regular globals (the kind that are
now __gshared).  Now, at least on Windows, it seems that there is no
discernible difference and if anything, TLS is slightly faster than __gshared.
 What's changed?
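
Out of curiosity, a micro-benchmark along these lines is easy to sketch with present-day Phobos (std.datetime.stopwatch did not exist in 2009, so this is not dsimcha's actual benchmark):

import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

int tlsCounter;              // module-level variables are thread-local in D2
__gshared int sharedCounter; // a classic process-wide global

void bumpTls()    { foreach (i; 0 .. 1_000_000) tlsCounter++; }
void bumpShared() { foreach (i; 0 .. 1_000_000) sharedCounter++; }

void main()
{
    auto r = benchmark!(bumpTls, bumpShared)(100);
    writeln("TLS:       ", r[0]);
    writeln("__gshared: ", r[1]);
}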


Re: The bizarre world of typeof()

2009-10-26 Thread Lars T. Kyllingstad

Kagamin wrote:

Lars T. Kyllingstad Wrote:


 // This one fails with the following hilarious message:
 // Error: static assert  (is(real function() == function)) is false
 static assert (is (typeof(&Foo.bar) == function));


Failure is valid, compiler just can't show member function types correctly.



I'm not saying it should compile, I'm saying that the compiler should 
give an error when it encounters the expression Foo.bar, and not just 
because of the failed assertion. It's bad enough that it accepts Foo.bar 
(this is what Don was talking about), but allowing one to take the 
address as well is just nonsense -- even when it's in an is(typeof()) 
expression.


In fact, &Foo.bar actually returns an address. The following compiles:

class Foo { real bar() { return 1.0; } }
auto f = &Foo.bar;
auto x = f();

Of course, when run, it segfaults on the last line. I wonder where f 
actually points to.


-Lars


Re: The bizarre world of typeof()

2009-10-26 Thread Kagamin
Lars T. Kyllingstad Wrote:

 I'm saying that the compiler should 
 give an error when it encounters the expression Foo.bar
why?


Re: The bizarre world of typeof()

2009-10-26 Thread Lars T. Kyllingstad

Kagamin wrote:

Lars T. Kyllingstad Wrote:

I'm saying that the compiler should 
give an error when it encounters the expression Foo.bar

why?



What does it mean?

-Lars


Re: The bizarre world of typeof()

2009-10-26 Thread grauzone

Lars T. Kyllingstad wrote:

Kagamin wrote:

Lars T. Kyllingstad Wrote:


 // This one fails with the following hilarious message:
 // Error: static assert  (is(real function() == function)) is false
 static assert (is (typeof(&Foo.bar) == function));


Failure is valid, compiler just can't show member function types 
correctly.



I'm not saying it should compile, I'm saying that the compiler should 
give an error when it encounters the expression Foo.bar, and not just 
because of the failed assertion. It's bad enough that it accepts Foo.bar 
(this is what Don was talking about), but allowing one to take the 
address as well is just nonsense -- even when it's in an is(typeof()) 
expression.


In fact, &Foo.bar actually returns an address. The following compiles:

class Foo { real bar() { return 1.0; } }
auto f = &Foo.bar;
auto x = f();

Of course, when run, it segfaults on the last line. I wonder where f 
actually points to.


We need that to get the address of a function. It's just that the type 
of the returned object is a bit bogus: it's not really a function; it's 
a method pointer casted to a function pointer. It simply has the wrong 
calling convention.


We also need typeof(Foo.bar) to get the parameter and return types for 
the bar method.



-Lars


Re: The bizarre world of typeof()

2009-10-26 Thread Denis Koroskin

On Mon, 26 Oct 2009 16:38:42 +0300, Kagamin s...@here.lot wrote:


Lars T. Kyllingstad Wrote:


I'm saying that the compiler should
give an error when it encounters the expression Foo.bar

why?


This is a somewhat invalid expression without a context:

class Foo
{
int bar() { return 42; }
}

auto dg1 = &Foo.bar; // fine?
int x = dg1(); // fine, too?

But sometimes context is implicit:

class Derived : Foo
{
auto get()
{
return &Foo.bar; // context is this and is implicit
}
}

auto dg2 = (new Derived()).get();
int y = dg2(); // okay



Also sometimes you need a function and don't care about context:

int delegate() dg;
dg.funcptr = &Foo.bar;
dg.ptr = new Foo();

int z = dg(); // also fine

So Foo.bar should stay.


Lars T. Kyllingstad Wrote:


 // This one fails with the following hilarious message:
 // Error: static assert  (is(real function() == function)) is false
 static assert (is (typeof(&Foo.bar) == function));


Failure is valid, compiler just can't show member function types  
correctly.


That's a correct behavior. &Foo.bar returns a *function*, not a delegate  
(because of a lack of context), and no type information is associated  
with it.


Well, it could return a delegate (with a null context), too, but then  
reassigning delegate.funcptr would be less obvious:


dg.funcptr = (&Foo.bar).funcptr;


And it won't save you from invoking a delegate with a missing context:

int delegate() dg = &Foo.bar;
dg(); // oops!


Re: The bizarre world of typeof()

2009-10-26 Thread Don

Lars T. Kyllingstad wrote:

Don wrote:
I'm trying to make sense of the rules for 'typeof'. It's difficult 
because DMD's behaviour is so different to the spec. Here's four 
simple cases.


// This doesn't compile on D1.
//alias typeof(int*int) Alias1;

// This compiles in D1, but not in D2.
alias int Int;
alias typeof(Int*Int) Alias2;

// Yet this DOES compile on D2 !
typeof(T*U) foo(T, U)(T x, U y) { return x*y; }
alias typeof(foo(Int, Int)) Alias3;

// And this fails on both D1 and D2, with a dreadful error message.
//alias typeof(foo(int)) Alias4;

I can't see anything in the spec to say why ANY of these examples 
should compile. Yet, the existing template constraints feature relies 
on the Alias3 case.



Here are a few more:

class Foo { real bar() { return 1.0; } }
Foo foo = new Foo;

// Passes, but should fail.
static assert (is (typeof(foo.bar) == function));

// Passes, as expected.
static assert (is (typeof(&foo.bar) == delegate));

// Passes, but should fail. This is similar to Don's examples.
static assert (is (typeof(Foo.bar) == function));

// This one fails with the following hilarious message:
// Error: static assert  (is(real function() == function)) is false
static assert (is (typeof(&Foo.bar) == function));

I have no idea why typeof(Foo.bar) even works, but it does. Foo is 
completely meaningless.


Dot has higher precedence than &, so it means &(Foo.bar), not (&Foo).bar.

The static assert fails because real function() is a *function 
pointer*, but is(xxx == function) tests to see if xxx is a *function*, 
not a *function pointer*.


So this passes:
void function () goo;
static assert( is (typeof(*goo) == function));

It's pretty awful that in is(real function() == function), the 
keyword 'function' has two contradictory meanings in the same 
expression. And the spec never says what a function type is.



I can see two ways forward:
(1) enforce the existing spec. Make all uses of types as expressions 
into a bug. This will break a lot of existing code, including several 
in the DMD test suite!
You'd generally need to include a .init whenever using a type inside a 
typeof(). This would make some code a lot uglier.
I'm also not sure what happens with alias parameters. (If A is an 
alias to a type, then typeof(A*A) should be changed to 
typeof(A.init*A.init); but if it's an alias to a variable, it should 
remain as typeof(A*A)).


(2) Define that, inside a typeof() expression, any type T is 
translated into T.init. The syntax for typeof() would need to be 
changed, in order to allow the case 'alias1'.


Note, however, that in both cases there's no such thing as .init for 
tuples; it might need to be added.


Behaviour (2) is probably more convenient, behaviour (1) is easier to 
justify. But I think the existing behaviour of typeof() doesn't make 
much sense.


I vote for (1). There should be as few special cases in the language 
as possible.


-Lars


Re: [OT] What should be in a programming language?

2009-10-26 Thread Jason House
Kagamin Wrote:

 Yigal Chripun Wrote:
 
  for instance there's special handling of void return types so it would 
  be easier to work with in generic code. instead of this compiler hack a 
  much simpler solution is to have a unit type and ditch C style void. the 
  bottom type should also exist mainly for completeness and for a few 
  stdlib functions like abort() and exit()
 
 uint and void return types may be nearly equivalent for x86 architecture, CLI 
 makes strong difference between them.


Are you two talking about the same thing? uint and unit are quite different 
from each other. My understanding from Scala is that most/all uses of unit are 
optimized away. I still don't know what unit holds...
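
For what it's worth, a unit type holds exactly one value (and therefore no information); a hedged sketch of why that helps generic code, with made-up names:

struct Unit {}

Unit log(string msg)
{
    import std.stdio : writeln;
    writeln(msg);
    return Unit();   // there is exactly one value to return
}

// Generic code can store and forward the result uniformly; with a void
// return type this wrapper would need a special case.
T callAndReturn(T)(T function(string) f, string arg)
{
    return f(arg);
}

void main()
{
    Unit u = callAndReturn(&log, "hello");
}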


Re: GC Precision

2009-10-26 Thread Sean Kelly
dsimcha Wrote:

 I just realized last night that D's templates are probably powerful enough now
 to generate bit masks that can be used for precise GC heap scanning.  I'm
 halfway (emphasis on halfway) thinking of using this to try to hack the GC and
 make heap scanning fully precise except for the corner case of unions.
 However, this ties into several things that others in the D community are
 doing, so I want to gauge people's responses and make sure I'm not wasting
 effort on something that will be useless in 6 months.
 
 1.  Sean, Leonardo, whoever else may be working on GC implementations, have
 you by any chance broken ground on precise heap scanning already?

I've thought about it, but not done anything about it.  The compiler doesn't 
provide this information, so precise scanning would require a user-level call.  
You'll also have to deal with arrays of structs, by the way.


Re: GC Precision

2009-10-26 Thread dsimcha
== Quote from Sean Kelly (s...@invisibleduck.org)'s article
 dsimcha Wrote:
  I just realized last night that D's templates are probably powerful enough 
  now
  to generate bit masks that can be used for precise GC heap scanning.  I'm
  halfway (emphasis on halfway) thinking of using this to try to hack the GC 
  and
  make heap scanning fully precise except for the corner case of unions.
  However, this ties into several things that others in the D community are
  doing, so I want to gauge people's responses and make sure I'm not wasting
  effort on something that will be useless in 6 months.
 
  1.  Sean, Leonardo, whoever else may be working on GC implementations, have
  you by any chance broken ground on precise heap scanning already?
 I've thought about it, but not done anything about it.  The compiler doesn't
provide this information, so precise scanning would require a user-level call.
You'll also have to deal with arrays of structs, by the way.

Arrays of structs are easy:  Generate a bitmask for one element, and keep 
reusing
that bitmask until the end of the block.  Am I missing something?
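
A hedged sketch of that reuse, assuming the [length, element size, offsets...] mask layout from earlier in the thread (scanPointer is a placeholder, not a druntime call):

void scanPointer(void* p)
{
    // mark p if it points into the GC heap (elided)
}

void scanArray(void* block, size_t blockLen, const size_t[] mask)
{
    immutable elemSize = mask[1];            // bytes per element
    auto bytes = cast(ubyte*) block;
    for (size_t base = 0; base + elemSize <= blockLen; base += elemSize)
        foreach (off; mask[2 .. $])          // pointer offsets within one element
            scanPointer(*cast(void**)(bytes + base + off));
}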


Re: The bizarre world of typeof()

2009-10-26 Thread Lars T. Kyllingstad

Don wrote:

Lars T. Kyllingstad wrote:

Don wrote:
I'm trying to make sense of the rules for 'typeof'. It's difficult 
because DMD's behaviour is so different to the spec. Here's four 
simple cases.


// This doesn't compile on D1.
//alias typeof(int*int) Alias1;

// This compiles in D1, but not in D2.
alias int Int;
alias typeof(Int*Int) Alias2;

// Yet this DOES compile on D2 !
typeof(T*U) foo(T, U)(T x, U y) { return x*y; }
alias typeof(foo(Int, Int)) Alias3;

// And this fails on both D1 and D2, with a dreadful error message.
//alias typeof(foo(int)) Alias4;

I can't see anything in the spec to say why ANY of these examples 
should compile. Yet, the existing template constraints feature 
relies on the Alias3 case.



Here are a few more:

class Foo { real bar() { return 1.0; } }
Foo foo = new Foo;

// Passes, but should fail.
static assert (is (typeof(foo.bar) == function));

// Passes, as expected.
static assert (is (typeof(&foo.bar) == delegate));

// Passes, but should fail. This is similar to Don's examples.
static assert (is (typeof(Foo.bar) == function));

// This one fails with the following hilarious message:
// Error: static assert  (is(real function() == function)) is false
static assert (is (typeof(&Foo.bar) == function));

I have no idea why typeof(Foo.bar) even works, but it does. Foo is 
completely meaningless.


Dot has higher precedence than &, so it means &(Foo.bar), not (&Foo).bar.

The static assert fails because real function() is a *function 
pointer*, but is(xxx == function) tests to see if xxx is a *function*, 
not a *function pointer*.


So this passes:
void function () goo;
static assert( is (typeof(*goo) == function));

It's pretty awful that in is(real function() == function), the 
keyword 'function' has two contradictory meanings in the same 
expression. And the spec never says what a function type is.



Ok, I see now. It's not wrong then, just ugly. :)

  static assert (is (int delegate() == delegate));  // passes
  static assert (is (int function() == function));  // fails


What about my first example then, is that the intended behaviour as 
well? With the current property syntax, I'd expect this to work, but it 
doesn't:


  static assert (is (typeof(foo.bar) == typeof(foo.bar())));

  Error: static assert  (is(real() == real)) is false

-Lars


Re: The bizarre world of typeof()

2009-10-26 Thread Lars T. Kyllingstad

grauzone wrote:

Lars T. Kyllingstad wrote:

Kagamin wrote:

Lars T. Kyllingstad Wrote:


 // This one fails with the following hilarious message:
 // Error: static assert  (is(real function() == function)) is 
false

 static assert (is (typeof(&Foo.bar) == function));


Failure is valid, compiler just can't show member function types 
correctly.



I'm not saying it should compile, I'm saying that the compiler should 
give an error when it encounters the expression Foo.bar, and not just 
because of the failed assertion. It's bad enough that it accepts 
Foo.bar (this is what Don was talking about), but allowing one to take 
the address as well is just nonsense -- even when it's in an 
is(typeof()) expression.


In fact, &Foo.bar actually returns an address. The following compiles:

class Foo { real bar() { return 1.0; } }
auto f = &Foo.bar;
auto x = f();

Of course, when run, it segfaults on the last line. I wonder where f 
actually points to.


We need that to get the address of a function. It's just that the type 
of the returned object is a bit bogus: it's not really a function; it's 
a method pointer casted to a function pointer. It simply has the wrong 
calling convention.


Ok, thanks for explaining. Are there cases where it's useful to have a 
pointer to a member function without its context?



We also need typeof(Foo.bar) to get the parameter and return types for 
the bar method.


Good point.

-Lars


Re: [OT] What should be in a programming language?

2009-10-26 Thread Jason House
Yigal Chripun Wrote:

 On 25/10/2009 06:26, Jason House wrote:
 
  My web search and some PDF's didn't turn up a handy example. You can
  do things in Scala like define your own foreach loop. If foreach had
  the form foreach(x){y} then x would be one set of arguments and
  y would be another set. It makes for pretty use of library functions.
  They look built in!
 
 
 isn't that similar in concept to code blocks?


I'm not familiar enough with code blocks to say for sure. From what I saw in 
blogs, they are not. Either way, D can't make things look built in like Scala 
can. IMHO, it's a great programming language feature.

 

  I looked over the links (quickly). I must admit I don't get it yet.
  It takes me a while to digest lisp fragments... Can you give a D-ish
  example of what it'd look like?
 
 
 here's a Nemerle example:
 
 macro PrintStage() {
   System.Console.WriteLine("This is executed during compilation");
   <[ System.Console.WriteLine("This is executed at run time") ]>
 }
 
 the first WriteLine is executed during compilation, and the macro 
 returns the AST for the second WriteLine which will be executed at run 
 time when this macro is called.

How is that different from a normal function definition that includes some 
compile-time calls? I agree that compile-time code should look and feel like 
normal code. It seems you use macro to switch to compile-time by default and 
runtime when explicitly marked? Having both defaults (compile time or run time) 
makes sense.




 one important design goal is to clearly separate the stages, so this 
 will go to a separate .d file and will be compiled into a lib.
 to use this macro you simply specify
 compiler --load-macro=myMacro sources.d
 
 in user code you just use print();

I disagree with this. The code that uses the macros should declare what it uses.


 
 OK, here's an example:
 
 class Foo {
 int a;
 void bar();
 }
 
 auto obj = new Foo;
 obj.a = 42; // obj contains a
 obj.bar();  // calls 'Foo.vtbl.bar
 
  remember that 'Foo is the classinfo singleton for Foo
 
 class Foo {
 static a;
 static void bar();
 }
 
 Foo.a = 42; // 'Foo contains a
 Foo.bar(); // calls ''Foo.vtbl.bar
 
  ''Foo is the classinfo singleton for 'Foo
 
 we get the following chain (-- means instance of)
 obj -- Foo -- MetaFoo -- MetaClass -- Class
 
 compared with C++/D/Java/etc:
 obj -- Foo -- Class

Ok. That makes sense. It can be simplified when statics are removed.
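
A rough D analogue of the PrintStage example quoted above, using a string mixin; it only imitates the two stages, not Nemerle's typed AST or compiler plugins:

import std.stdio : writeln;

template printStage()
{
    // Runs while the compiler instantiates the template:
    pragma(msg, "This is executed during compilation");
    // The "returned code", mixed in at the call site and run later:
    enum printStage = q{ writeln("This is executed at run time"); };
}

void main()
{
    mixin(printStage!());
}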



Re: GC Precision

2009-10-26 Thread Leandro Lucarella
dsimcha, el 26 de octubre a las 13:08 me escribiste:
 I just realized last night that D's templates are probably powerful enough now
 to generate bit masks that can be used for precise GC heap scanning.  I'm
 halfway (emphasis on halfway) thinking of using this to try to hack the GC and
 make heap scanning fully precise except for the corner case of unions.
 However, this ties into several things that others in the D community are
 doing, so I want to gauge people's responses and make sure I'm not wasting
 effort on something that will be useless in 6 months.
 
 1.  Sean, Leonardo, whoever else may be working on GC implementations, have
 you by any chance broken ground on precise heap scanning already?

Maybe you're talking about me. I didn't have the chance to play with this
yet. My main goal is to make the collection run concurrently with the
mutator, but I have been a little busy lately so I haven't made many
advances yet. I would like to play with adding preciseness to the GC too if
I have the time.

 2.  Andrei, Walter, how close are we to actually eliminating new from the
 language?  If all allocations were done by either calling GC.malloc() or using
 templates that call GC.malloc(), then things would get a lot simpler than if I
 were to have to hack the compiler to make new pass type info to the GC.

The runtime is already receiving the type information on the allocated
object when new is used AFAIK, but this information is not propagated to
gc_malloc(). So it shouldn't be too hard to add type information to the
GC. There was some interesting discussion about this some time ago.
http://www.digitalmars.com/d/archives/digitalmars/D/Std_Phobos_2_and_logging_library_87794.html#N87831

 3.  I'm thinking bit masks could be stored as follows:
 
 When getBitMask!(T) is instantiated, it generates an immutable size_t[N].
 Element 0 is the size of the array (to allow for storing only the ptr in the
 GC), element 1 is the size of one instance of the object, in bytes.  The size
 of the memory block must be a multiple of this.  Elements 2..$ are all of the
 offsets that should be scanned for pointers.  For example:
 
 struct Foo {
 uint bar;
 void* baz;
 }
 
 getBitMask!(Foo);  // [3, 8, 4].
 
 That leaves the problem of where/how to store the pointers to this information
 in the GC efficiently.  I haven't gotten that far yet, but I remember some
 concerns have been raised in the past about storing 4 bytes per GC object for
 a pointer to the bitmask.  For my use cases, where I tend to allocate a
 relatively small number of relatively large objects, this isn't a problem.
 However, in a heavily OO environment, where people allocate tons of tiny
 objects, it might be.

In the discussion I mentioned Frits van Bommel proposed a reasonable way
to encode the information efficiently:
http://www.digitalmars.com/d/archives/digitalmars/D/Std_Phobos_2_and_logging_library_87794.html#N87968

-- 
Leandro Lucarella (AKA luca) http://llucax.com.ar/
--
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
--
The Guinness Book of Records holds the record for being the most
stolen book in public libraries


Re: [OT] What should be in a programming language?

2009-10-26 Thread Kagamin
Jason House Wrote:

 An array with some compile time values would be easy [foo(7), #bar(8)].

Hmm... these array literals work like crazy in dynamically typed languages, but 
are they that good for statically typed ones?


Re: The bizarre world of typeof()

2009-10-26 Thread Kagamin
Denis Koroskin Wrote:

  Lars T. Kyllingstad Wrote:
 
   // This one fails with the following hilarious message:
   // Error: static assert  (is(real function() == function)) is false
   static assert (is (typeof(&Foo.bar) == function));
 
  Failure is valid, compiler just can't show member function types  
  correctly.
 
 That's a correct behavior. &Foo.bar returns a *function*, not a delegate  
 (because of a lack of context), and no type information is associated  
 with it.

Yes, the context is not *provided*, but is *required*.
*function* doesn't require context.


Re: The bizarre world of typeof()

2009-10-26 Thread grauzone

Lars T. Kyllingstad wrote:

grauzone wrote:

Lars T. Kyllingstad wrote:

Kagamin wrote:

Lars T. Kyllingstad Wrote:


 // This one fails with the following hilarious message:
 // Error: static assert  (is(real function() == function)) is 
false

 static assert (is (typeof(&Foo.bar) == function));


Failure is valid, compiler just can't show member function types 
correctly.



I'm not saying it should compile, I'm saying that the compiler should 
give an error when it encounters the expression Foo.bar, and not 
just because of the failed assertion. It's bad enough that it accepts 
Foo.bar (this is what Don was talking about), but allowing one to 
take the address as well is just nonsense -- even when it's in an 
is(typeof()) expression.


In fact, &Foo.bar actually returns an address. The following compiles:

class Foo { real bar() { return 1.0; } }
auto f = &Foo.bar;
auto x = f();

Of course, when run, it segfaults on the last line. I wonder where f 
actually points to.


We need that to get the address of a function. It's just that the type 
of the returned object is a bit bogus: it's not really a function; 
it's a method pointer casted to a function pointer. It simply has the 
wrong calling convention.


Ok, thanks for explaining. Are there cases where it's useful to have a 
pointer to a member function without its context?


You could use it to dynamically build a delegate to that function. Or to 
allow serialization of delegates. Maybe there are other uses as well. 
Anyway, you really shouldn't have to instantiate a class just to get 
method addresses.




We also need typeof(Foo.bar) to get the parameter and return types 
for the bar method.


Good point.

-Lars


Re: Thread-local storage and Performance

2009-10-26 Thread Pelle Månsson

dsimcha wrote:

Has D's builtin TLS been optimized in the past 6 months to year?  I had
benchmarked it awhile back when optimizing some code that I wrote and
discovered it was significantly slower than regular globals (the kind that are
now __gshared).  Now, at least on Windows, it seems that there is no
discernible difference and if anything, TLS is slightly faster than __gshared.
 What's changed?


I was under the impression that TLS should be faster due to absence of 
synchronization.


Re: Thread-local storage and Performance

2009-10-26 Thread Denis Koroskin
On Mon, 26 Oct 2009 18:26:02 +0300, Pelle Månsson  
pelle.mans...@gmail.com wrote:



dsimcha wrote:

Has D's builtin TLS been optimized in the past 6 months to year?  I had
benchmarked it awhile back when optimizing some code that I wrote and
discovered it was significantly slower than regular globals (the kind  
that are

now __gshared).  Now, at least on Windows, it seems that there is no
discernible difference and if anything, TLS is slightly faster than  
__gshared.

 What's changed?


I was under the impression that TLS should be faster due to absence of  
synchronization.


__gshared doesn't have any locks/barriers associated with it.
TLS should be slightly slower due to an additional indirection, but I  
don't think it would be noticeable.


Re: Thread-local storage and Performance

2009-10-26 Thread dsimcha
== Quote from Pelle Månsson (pelle.mans...@gmail.com)'s article
 dsimcha wrote:
  Has D's builtin TLS been optimized in the past 6 months to year?  I had
  benchmarked it awhile back when optimizing some code that I wrote and
  discovered it was significantly slower than regular globals (the kind that 
  are
  now __gshared).  Now, at least on Windows, it seems that there is no
  discernible difference and if anything, TLS is slightly faster than 
  __gshared.
   What's changed?
 I was under the impression that TLS should be faster due to absence of
 synchronization.

__gshared == old-skool cowboy sharing, i.e. plain old unsynchronized globals.

Without getting into the details of my specific case, the reason I'm interested 
in
this is that I have some code that I want to be as fast as possible in both
single- and multithreaded environments.  Right now, it has a hack that checks
thread_needLock() and uses plain old globals for everything as long as the 
program
is single-threaded because that seemed faster than TLS lookups a while ago.
However, running the same benchmark again shows otherwise.


Re: TDPL reaches Thermopylae level

2009-10-26 Thread Jeremie Pelletier

Andrei Alexandrescu wrote:

303 pages and counting!

Andrei


Soon the PI level, or at least 10 times PI!


Re: GC Precision

2009-10-26 Thread Sean Kelly
dsimcha Wrote:

 == Quote from Sean Kelly (s...@invisibleduck.org)'s article
  dsimcha Wrote:
   I just realized last night that D's templates are probably powerful 
   enough now
   to generate bit masks that can be used for precise GC heap scanning.  I'm
   halfway (emphasis on halfway) thinking of using this to try to hack the 
   GC and
   make heap scanning fully precise except for the corner case of unions.
   However, this ties into several things that others in the D community are
   doing, so I want to gauge people's responses and make sure I'm not wasting
   effort on something that will be useless in 6 months.
  
   1.  Sean, Leonardo, whoever else may be working on GC implementations, 
   have
   you by any chance broken ground on precise heap scanning already?
  I've thought about it, but not done anything about it.  The compiler doesn't
 provide this information, so precise scanning would require a user-level call.
 You'll also have to deal with arrays of structs, by the way.
 
 Arrays of structs are easy:  Generate a bitmask for one element, and keep 
 reusing
 that bitmask until the end of the block.  Am I missing something?

Nope.


Re: TDPL reaches Thermopylae level

2009-10-26 Thread Bill Baxter
On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier jerem...@gmail.com wrote:
 Andrei Alexandrescu wrote:

 303 pages and counting!

 Andrei

 Soon the PI level, or at least 10 times PI!


A hundred even. ;-)

--bb


Re: GC Precision

2009-10-26 Thread Andrei Alexandrescu

Sean Kelly wrote:

dsimcha Wrote:


== Quote from Sean Kelly (s...@invisibleduck.org)'s article

dsimcha Wrote:

I just realized last night that D's templates are probably powerful enough now
to generate bit masks that can be used for precise GC heap scanning.  I'm
halfway (emphasis on halfway) thinking of using this to try to hack the GC and
make heap scanning fully precise except for the corner case of unions.
However, this ties into several things that others in the D community are
doing, so I want to gauge people's responses and make sure I'm not wasting
effort on something that will be useless in 6 months.

1.  Sean, Leonardo, whoever else may be working on GC implementations, have
you by any chance broken ground on precise heap scanning already?

I've thought about it, but not done anything about it.  The compiler doesn't

provide this information, so precise scanning would require a user-level call.
You'll also have to deal with arrays of structs, by the way.

Arrays of structs are easy:  Generate a bitmask for one element, and keep 
reusing
that bitmask until the end of the block.  Am I missing something?


Nope.


One question is, is there enough information for stack variables? My 
understanding from a while ago was that heap data could be reasonably 
analyzed, but stack data has no info associated with it.


Andrei


Re: GC Precision

2009-10-26 Thread dsimcha
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article
 Sean Kelly wrote:
  dsimcha Wrote:
 
  == Quote from Sean Kelly (s...@invisibleduck.org)'s article
  dsimcha Wrote:
  I just realized last night that D's templates are probably powerful 
  enough now
  to generate bit masks that can be used for precise GC heap scanning.  I'm
  halfway (emphasis on halfway) thinking of using this to try to hack the 
  GC and
  make heap scanning fully precise except for the corner case of unions.
  However, this ties into several things that others in the D community are
  doing, so I want to gauge people's responses and make sure I'm not 
  wasting
  effort on something that will be useless in 6 months.
 
  1.  Sean, Leonardo, whoever else may be working on GC implementations, 
  have
  you by any chance broken ground on precise heap scanning already?
  I've thought about it, but not done anything about it.  The compiler 
  doesn't
  provide this information, so precise scanning would require a user-level 
  call.
  You'll also have to deal with arrays of structs, by the way.
 
  Arrays of structs are easy:  Generate a bitmask for one element, and keep 
  reusing
  that bitmask until the end of the block.  Am I missing something?
 
  Nope.
 One question is, is there enough information for stack variables? My
 understanding from a while ago was that heap data could be reasonably
 analyzed, but stack data has no info associated with it.
 Andrei

That's why I said precise *heap* scanning.  This would solve probably 90+% of 
the
problem w/ false pointers without requiring any changes with major ripple 
effects,
i.e. only druntime, not the compiler, would need to be hacked.  Admittedly,
though, unless you came up w/ some pinning scheme for stack variables and 
unions,
it still wouldn't allow a moving GC.

I personally am much more interested in a decent solution to our GC woes now 
than
a perfect one at some point indefinitely far into the future.  Right now, when
working with programs that use more than maybe 100-200 MB of memory, false
pointers become such a problem that the GC is almost useless, yet all kinds of
library code still uses the GC heap, which is why I resisted the idea of 
removing
GC.free() so strongly.  As I see it, the biggest problem is false pointers, with
the fact that every allocation requires a lock in a close second.  These are the
low-hanging fruit.  A moving GC, one that doesn't stop the world on collection,
and one that's fully precise including stack would be nice, but they're several
orders of magnitude less important and would also have more ripple effects.


Re: [OT] What should be in a programming language?

2009-10-26 Thread Yigal Chripun
Jason House Wrote:

 How is that different from a normal function definition that includes some 
 compile-time calls? I agree that compile-time code should look and feel like 
 normal code. It seems you use macro to switch to compile-time by default and 
 runtime when explcitly marked? Having both defaults (compile time or run 
 time) makes sense.
 

The way it's implemented in Nemerle, a macro is actually a class. 
The above is not how it works. 
The code inside a macro is regular run-time code. It is compiled into a lib and 
loaded by the compiler as a plugin. 
The code is run at run-time, but run-time here means run-time of the compiler, 
since it's a plugin of the compiler. 
In Nemerle (like in FP) the last value in a function is what the function 
returns, so that macro *returns* an AST representation of what's inside. 
You can use this operator to de/compose the AST. 

 
 
  one important design goal is to clearly separate the stages, so this 
  will go to a separate .d file and will be compiled into a lib.
  to use this macro you simply specify
  compiler --load-macro=myMacro sources.d
  
  in user code you just use print();
 
 I disagree with this. The code that uses the macros should declare what it 
 uses.

I meant from a syntax POV - calling a macro is the same as calling a function, 
with no template syntax. Importing the namespace is still required IIRC.
 
 
  
  OK, here's an example:
  
  class Foo {
  int a;
  void bar();
  }
  
  auto obj = new Foo;
  obj.a = 42; // obj contains a
  obj.bar();  // calls 'Foo.vtbl.bar
  
  remember that 'Foo is the classinfo singleton for Foo
  
  class Foo {
  static a;
  static void bar();
  }
  
  Foo.a = 42; // 'Foo contains a
  Foo.bar(); // calls ''Foo.vtbl.bar
  
  ''Foo is the classinfo singleton for 'Foo
  
  we get the following chain (-- means instance of)
  obj -- Foo -- MetaFoo -- MetaClass -- Class
  
  compared with C++/D/Java/etc:
  obj -- Foo -- Class
 
 Ok. That makes sense. It can be simplified when statics are removed.
 

I don't understand this. How does the removal of statics simplify this?
I think that having class-shared functions/data should still be possible, but 
implemented as above instead of in static memory as in C++/D. 
class Foo {
static int value;
}

This still works as in D, but value is a member of the singleton object that 
represents Foo at runtime instead of being stored in static memory. 

Those singletons need to be concurrency friendly, unlike the static memory 
design, which definitely is not. 

BTW, in dynamic languages like Smalltalk/Ruby those metaclasses are mutable, so 
you can for example add methods at run-time. I don't know if this should be 
allowed in a compiled language. 


Re: [OT] What should be in a programming language?

2009-10-26 Thread Jason House
Yigal Chripun Wrote:

 Jason House Wrote:
 
  How is that different from a normal function definition that includes some 
  compile-time calls? I agree that compile-time code should look and feel 
  like normal code. It seems you use macro to switch to compile-time by 
  default and runtime when explcitly marked? Having both defaults (compile 
  time or run time) makes sense.
  
 
 The way it's implemented in Nemerle, a macro is actually a class. 
 the above is not how it works. 
 the code inside a macro is regular run-time code. it is compiled into a lib 
 and loaded by the compiler as a plugin. 
 the code is run at run-time but run-time here means run-time of the compiler 
 since it's a plugin of the compiler. 
 in nemerle (like in FP) the last value in a function is what the function 
 returns. so that macro *returns* an AST representation of what's inside. 
 you can use this operator to de/compose AST. 


Your examples in Nemerle or D-ish looked like they are returning strings. I'm 
still not seeing the magic of AST macros.


 
  
 
   OK, here's an example:
   
   class Foo {
   int a;
   void bar();
   }
   
   auto obj = new Foo;
   obj.a = 42; // obj contains a
   obj.bar();  // calls 'Foo.vtbl.bar
   
   remember that 'Foo is the classinfo singleton for Foo
   
   class Foo {
   static a;
   static void bar();
   }
   
   Foo.a = 42; // 'Foo contains a
   Foo.bar(); // calls ''Foo.vtbl.bar
   
   ''Foo is the classinfo singleton for 'Foo
   
   we get the following chain (-- means instance of)
   obj -- Foo -- MetaFoo -- MetaClass -- Class
   
   compared with C++/D/Java/etc:
   obj -- Foo -- Class
  
  Ok. That makes sense. It can be simplified when statics are removed.
  
 
 I don't understand this. How removal of statics simplifies this?

As I understood it 'Foo contains the static data and class info for Foo, and 
''Foo contains class info for 'Foo. Without statics, ''Foo is unnecessary. I'm 
sure I've misinterpreted what you're saying ;)


 I think that having class shared functions/data should still be possible but 
 implemented as above instead of static memory as in c++/D. 
 class Foo {
 static int value;
 }
 
 this still works as in D but value is a member of the singleton object that 
 represents Foo at runtime instead of stored in static memory.

The singleton object should be in static memory... I don't really see the 
distinction since the finer storage details don't affect the programmer.
 
 
 those singletons need to be concurrency friendly unlike the static memory 
 design that is definitely is not. 
 
 btw, in dynamic languages like smalltalk/ruby those meta classes are mutable 
 so you can for example add methods at run-time. I don't know if this should 
 be allowed in a compiled language. 



Re: Thread-local storage and Performance

2009-10-26 Thread Walter Bright

dsimcha wrote:

== Quote from Pelle Månsson (pelle.mans...@gmail.com)'s article

dsimcha wrote:

Has D's builtin TLS been optimized in the past 6 months to year?  I had
benchmarked it awhile back when optimizing some code that I wrote and
discovered it was significantly slower than regular globals (the kind that are
now __gshared).  Now, at least on Windows, it seems that there is no
discernible difference and if anything, TLS is slightly faster than __gshared.
 What's changed?

I was under the impression that TLS should be faster due to absence of
synchronization.


__gshared == old-skool cowboy sharing, i.e. plain old unsynchronized globals.

Without getting into the details of my specific case, the reason I'm interested 
in
this is that I have some code that I want to be as fast as possible in both
single- and multithreaded environments.  Right now, it has a hack that checks
thread_needLock() and uses plain old globals for everything as long as the 
program
is single-threaded because that seemed faster than TLS lookups a while ago.
However, running the same benchmark again shows otherwise.


Nothing has changed. What I would do is to look at the assembler output 
and verify that the TLS globals really are TLS, and the ones that are 
not are really not.


Re: GC Precision

2009-10-26 Thread Andrei Alexandrescu

dsimcha wrote:

== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article

Sean Kelly wrote:

dsimcha Wrote:


== Quote from Sean Kelly (s...@invisibleduck.org)'s article

dsimcha Wrote:

I just realized last night that D's templates are probably powerful enough now
to generate bit masks that can be used for precise GC heap scanning.  I'm
halfway (emphasis on halfway) thinking of using this to try to hack the GC and
make heap scanning fully precise except for the corner case of unions.
However, this ties into several things that others in the D community are
doing, so I want to gauge people's responses and make sure I'm not wasting
effort on something that will be useless in 6 months.

1.  Sean, Leonardo, whoever else may be working on GC implementations, have
you by any chance broken ground on precise heap scanning already?

I've thought about it, but not done anything about it.  The compiler doesn't

provide this information, so precise scanning would require a user-level call.
You'll also have to deal with arrays of structs, by the way.

Arrays of structs are easy:  Generate a bitmask for one element, and keep 
reusing
that bitmask until the end of the block.  Am I missing something?

Nope.

One question is, is there enough information for stack variables? My
understanding from a while ago was that heap data could be reasonably
analyzed, but stack data has no info associated with it.
Andrei


That's why I said precise *heap* scanning.  This would solve probably 90+% of 
the
problem w/ false pointers without requiring any changes with major ripple 
effects,
i.e. only druntime, not the compiler, would need to be hacked.  Admittedly,
though, unless you came up w/ some pinning scheme for stack variables and 
unions,
it still wouldn't allow a moving GC.

I personally am much more interested in a decent solution to our GC woes now 
than
a perfect one at some point indefinitely far into the future.  Right now, when
working with programs that use more than maybe 100-200 MB of memory, false
pointers become such a problem that the GC is almost useless, yet all kinds of
library code still uses the GC heap, which is why I resisted the idea of 
removing
GC.free() so strongly.  As I see it, the biggest problem is false pointers, with
the fact that every allocation requires a lock in a close second.  These are the
low-hanging fruit.  A moving GC, one that doesn't stop the world on collection,
and one that's fully precise including stack would be nice, but they're several
orders of magnitude less important and would also have more ripple effects.


Absolutely! I think that's great work. Thanks for clarifying things for me.

Andrei


Re: TDPL reaches Thermopylae level

2009-10-26 Thread Andrei Alexandrescu

Bill Baxter wrote:

On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier jerem...@gmail.com wrote:

Andrei Alexandrescu wrote:

303 pages and counting!

Andrei

Soon the PI level, or at least 10 times PI!



A hundred even. ;-)


Coming along. I'm writing about strings and Unicode right now. I was 
wondering what people think about allowing concatenation (with ~ and ~=) 
of strings of different character widths. The support library could do 
all of the transcoding.


(I understand that concatenating an array of wchar or char with a dchar 
is already in bugzilla.)



Andrei


Re: TDPL reaches Thermopylae level

2009-10-26 Thread Bill Baxter
On Mon, Oct 26, 2009 at 11:51 AM, Andrei Alexandrescu
seewebsiteforem...@erdani.org wrote:
 Bill Baxter wrote:

 On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier jerem...@gmail.com
 wrote:

 Andrei Alexandrescu wrote:

 303 pages and counting!

 Andrei

 Soon the PI level, or at least 10 times PI!


 A hundred even. ;-)

 Coming along. I'm writing about strings and Unicode right now. I was
 wondering what people think about allowing concatenation (with ~ and ~=) of
 strings of different character widths. The support library could do all of
 the transcoding.

 (I understand that concatenating an array of wchar or char with a dchar is
 already in bugzilla.)

So a common way to convert wchar to char might then become ""~myWcharString?

That seems kind of odd.  Just using something like
to!(char[])(myWcharString) seems less goofy to me.

But that subjective reaction is all I have against it.

--bb


Re: TDPL reaches Thermopylae level

2009-10-26 Thread Jeremie Pelletier

Andrei Alexandrescu wrote:

Bill Baxter wrote:
On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier 
jerem...@gmail.com wrote:

Andrei Alexandrescu wrote:

303 pages and counting!

Andrei

Soon the PI level, or at least 10 times PI!



A hundred even. ;-)


Coming along. I'm writing about strings and Unicode right now. I was 
wondering what people think about allowing concatenation (with ~ and ~=) 
of strings of different character widths. The support library could do 
all of the transcoding.


(I understand that concatenating an array of wchar or char with a dchar 
is already in bugzilla.)



Andrei


I don't know if that's a good idea, it's better when string encoding is 
explicit so you know where your reallocations are.


i.e. if I know some routine will have to convert a utf16 parameter to utf8 
to append it to a string, then I'll try and either make it output utf16 
or input utf8. If it's implicit it's much harder to find and optimize 
these cases.


to!string() is easy enough to use anyways.

But it could be good to add a range type that does this with multiple 
opAppend/opAppendAssign overloads.
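
For what it's worth, a rough sketch of such an appender (UtfAppender is a made-up name; opCatAssign is the operator method that ~= maps to in this era of D, and it leans on std.utf.encode's overload that appends to a char[]):

import std.utf : encode;

// an appender-like type that accepts any character width and stores UTF-8
struct UtfAppender
{
    char[] data;

    void opCatAssign(const(char)[] s)  { data ~= s; }
    void opCatAssign(const(wchar)[] s) { foreach (dchar c; s) encode(data, c); }
    void opCatAssign(const(dchar)[] s) { foreach (dchar c; s) encode(data, c); }
}

With that, app ~= "a"; app ~= "b"w; app ~= "c"d; all land in the same UTF-8 buffer with no intermediate strings.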


Re: TDPL reaches Thermopylae level

2009-10-26 Thread Andrei Alexandrescu

Jeremie Pelletier wrote:

Andrei Alexandrescu wrote:

Bill Baxter wrote:
On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier 
jerem...@gmail.com wrote:

Andrei Alexandrescu wrote:

303 pages and counting!

Andrei

Soon the PI level, or at least 10 times PI!



A hundred even. ;-)


Coming along. I'm writing about strings and Unicode right now. I was 
wondering what people think about allowing concatenation (with ~ and 
~=) of strings of different character widths. The support library 
could do all of the transcoding.


(I understand that concatenating an array of wchar or char with a 
dchar is already in bugzilla.)



Andrei


I don't know if that's a good idea, it's better when string encoding is 
explicit so you know where your reallocations are.


The beauty of it is that reallocation with ~ occurs anyway, and with ~= 
is anyway imminent, regardless of the character width you're reallocating.


Allowing concatenation of strings of different widths is a nice way of 
acknowledging at the language level that all character widths are 
encodings of abstract characters.


i.e. if I know some routine will have to convert a utf16 parameter to utf8 
to append it to a string, then I'll try and either make it output utf16 
or input utf8. If it's implicit it's much harder to find and optimize 
these cases.


to!string() is easy enough to use anyways.

But it could be good to add a range type that does this with multiple 
opAppend/opAppendAssign overloads.


One problem with

s ~= to!string(someDstring);

is that it does two allocations instead of one.


Andrei


Re: TDPL reaches Thermopylae level

2009-10-26 Thread Andrei Alexandrescu

Bill Baxter wrote:

On Mon, Oct 26, 2009 at 11:51 AM, Andrei Alexandrescu
seewebsiteforem...@erdani.org wrote:

Bill Baxter wrote:

On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier jerem...@gmail.com
wrote:

Andrei Alexandrescu wrote:

303 pages and counting!

Andrei

Soon the PI level, or at least 10 times PI!


A hundred even. ;-)

Coming along. I'm writing about strings and Unicode right now. I was
wondering what people think about allowing concatenation (with ~ and ~=) of
strings of different character widths. The support library could do all of
the transcoding.

(I understand that concatenating an array of wchar or char with a dchar is
already in bugzilla.)


So a common way to convert wchar to char might then become ""~myWcharString?

That seems kind of odd.


Well, I guess. In particular, to me it's not clear what type we should 
assign to a concatenation between a string and a wstring. With ~=, it's 
much easier...



 Just using something like
to!(char[])(myWcharString) seems less goofy to me.


Problem is, an append + one transcoding requires two allocations. We 
could always define routines in std.string or std.utf:


append(s, ws); // s ~= ws

but really it's quite unambiguous what ~= should do. A nod from the 
language is a nice touch.



Andrei
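
For concreteness, a rough sketch of what such a single-allocation append could look like (the helper name append is made up; it relies on std.utf.encode's overload that appends to a char[]):

import std.utf : encode;

// transcode a wstring directly into the destination buffer, so only the
// destination ever reallocates (amortized), with no intermediate string
void append(ref char[] s, const(wchar)[] ws)
{
    foreach (dchar c; ws)    // foreach over wchar[] decodes to dchar
        encode(s, c);        // re-encode as UTF-8, appending to s
}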


Re: Private enum members + Descent

2009-10-26 Thread Yigal Chripun

On 26/10/2009 12:07, Lars T. Kyllingstad wrote:

Yigal Chripun wrote:

personally I'd like to see D enums replaced by Java style enums which
make more sense to me. D enums are even worse than C enums since you
can write:
enum foo = "text";

which to me looks very similar to:
auto cat = new Dog;



I agree that enum is a horrible keyword to use for declaring manifest
constants. In my opinion the D developers are sometimes a bit too afraid
of introducing new keywords, and this is one of the consequences.

Personally, I think this would be a better scheme:

const: manifest constants, no storage (like const in D1, enum in D2)
readonly: used for a read-only view of mutable data (like const in D2)
immutable: truly immutable data (like now)

-Lars


I don't think we need to add more options, I think we need to remove 
options.

there should be only two types, const and mutable.
manifest constants should be a linker optimization and immutable is only 
relevant as part of a concurrency model, and even then there's no need 
for a separate keyword.


immutable objects are shared objects where only the owner is allowed to 
modify them, and only if it's done in a thread-safe way (synchronized with 
locks, lock-free algorithms).

void foo() {
    // can be optimized away (transformed into a manifest constant)
    // taking the address of var is illegal
    const var = 42;
    // allocated on the heap and you can pass its address around
    auto bar = new const int(42);
}








Re: [OT] What should be in a programming language?

2009-10-26 Thread Yigal Chripun

On 26/10/2009 14:47, Kagamin wrote:

Yigal Chripun Wrote:


for instance there's special handling of void return types so it would
be easier to work with in generic code. instead of this compiler hack a
much simpler solution is to have a unit type and ditch C style void. the
bottom type should also exist mainly for completeness and for a few
stdlib functions like abort() and exit()


uint and void return types may be nearly equivalent for x86 architecture, CLI 
makes strong difference between them.


I have no idea what uint has to do with what I said.
in type theory, a unit type contains only one value, and a bottom type 
contains zero values.

the single value of unit can be for example an empty tuple.

a function like abort doesn't return anything at all, so its return type 
is the bottom type.


In ML all functions have exactly one tuple argument and one tuple return 
type.

so, for example this c function:
void foo();
would have the following signature in ML:
unit -> unit
if we have:
void foo();
void bar();

foo(bar()); is perfectly legal with ML semantics since both functions 
have the signature: unit -> unit
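
A small D sketch of the same idea, for illustration only (D has no built-in unit type, so an empty struct stands in for it):

struct Unit {}                      // a type with exactly one value

Unit bar() { return Unit(); }       // the unit-typed view of "void bar()"
Unit foo(Unit u) { return Unit(); }

void main()
{
    foo(bar());                     // legal: bar's result is an ordinary value
}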


Re: [OT] What should be in a programming language?

2009-10-26 Thread Yigal Chripun

On 26/10/2009 20:30, Jason House wrote:


Your examples in Nemerle or D-ish looked like they are returning
strings. I'm still not seeing the magic of AST macros.


When we want to decompose some large code (or more precisely, its syntax
tree), we must bind its smaller parts to variables. Then we can process
them recursively or just use them in an arbitrary way to construct the
result.

We can operate on entire subexpressions by writing $( ... ) or $ID
inside the quotation operator <[ ... ]>. This means binding the value of
ID or the interior of a parenthesized expression to the part of the syntax
tree described by the corresponding quotation.

macro for (init, cond, change, body)
{
  <[
    $init;
    def loop () : void {
      if ($cond) { $body; $change; loop() }
      else ()
    };
    loop ()
  ]>
}


The above macro defines function for, which is similar to the loop known
from C. It can be used like this:

for (mutable i = 0, i < 10, i++, printf ("%d", i))

the above is taken from the macros_tutorial page of nemerle.org.
unfortunately the site is down so I'm using Google's cache instead.

there are a few more related topics: Constructs with variable number of
elements, hygiene, ...




Re: GC Precision

2009-10-26 Thread dsimcha
I've spent some free brain cycles today thinking about this and here's the 
scheme
I have in mind.  If anyone thinks this could be improved in a way that would not
have substantial ripple effects throughout the compiler/language (because then 
it
might never actually get implemented) let me know.

1.  GC.malloc's signature changes to GC.malloc(size_t size, uint ba = 0, size_t*
bitmask = null).  A null bitmask means use the old-school behavior and either 
scan
everything or don't scan anything based on the NO_SCAN bit.  IMHO plain old
conservative scanning must be supported to allow for untyped memory blocks to be
allocated, because requiring every memory block to have a type associated with 
it
is an unacceptable limitation in a systems language.

For now, the only way to get precise scanning would be to call malloc directly 
or
use a template that does.  Eventually new would either be fixed to instantiate 
the
bitMask!(T) template or eliminated entirely.  Since using the precise scanning 
by
writing a template that calls GC.malloc() is a lot easier than hacking the
compiler, this may be a catalyst for getting rid of new.

2.  Some concern has been expressed in the past about the possibility of using 4
bytes per block in overhead to store pointers to bitmasks.  IMHO this concern is
misplaced because in any program that takes up enough memory for space 
efficiency
to matter, false pointers waste more space.  Furthermore, if you're programming
for a toaster oven, elevator controller, etc., you probably aren't going to use
the standard out-of-the-box GC anyhow.  However, I've thought of ways to 
mitigate
this and have come up with the following:

A.  Store a pointer to the bitmask iff the NO_SCAN bit is set.
this is a no-brainer and will prevent any overhead on,
for example, arrays of floats.

B.  Add another attributes bit to the GC called something like
NEEDS_BITMASK.  This bit would be set iff an object mixes
pointers and non-pointers.  If it's not set, no bitmask
pointer would be stored.  However, the overhead of an
extra bit may or may not be worth it.

3.  The bitmask pointer would be stored at the end of every GC-allocated block 
for
which a bitmask pointer is stored.  The reason for using the end of the block
instead of the beginning is just implementation simplicity:  That way, finding 
the
beginning of a block would work the same whether or not we have a bitmask 
pointer.

4.  The bitmask would be a size_t[] created at compile time by a template and
stored in the static data segment.  Its layout would be [length of array,
T.sizeof, offsets that need to be scanned].  For example, if you have something 
like:

struct Foo {
uint bar;
void* ptr;
}

On a 32-bit machine, bitMask!Foo would be [3, 8, 4].  On a 64-bit, it would be 
[3,
16, 8].  The reason the size of the array is stored in the array is so that we 
can
get away with storing a single ptr in each memory block instead of a pointer 
and a
length.

5.  To store information about pinning, we simply use the high-order bits of the
pointer offsets.  1 means pinned, 0 means not pinned.  This means that, for any
type T, T.sizeof can't be bigger than size_t.max / 2.  I think this is a fairly
minor limitation.
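
As a proof of concept, a rough compile-time sketch of such a bitMask template (illustrative only: it relies on .tupleof reflection, handles only flat structs whose pointer members are raw pointers or class references, and ignores nested aggregates, unions and arrays):

// CTFE helper: collect the offsets of pointer-like fields of a flat struct T
size_t[] buildMask(T)()
{
    size_t[] offsets;
    foreach (i, Field; typeof(T.tupleof))
    {
        static if (is(Field : void*) || is(Field == class))
            offsets ~= T.tupleof[i].offsetof;
    }
    // layout per the proposal: [length of array, T.sizeof, offsets...]
    return [offsets.length + 2, T.sizeof] ~ offsets;
}

template bitMask(T)
{
    // computed at compile time, so the array lives in the static data segment
    static immutable size_t[] bitMask = buildMask!T();
}

struct Foo { uint bar; void* ptr; }
// bitMask!Foo is [3, 8, 4] on 32-bit and [3, 16, 8] on 64-bit; a pointer to
// it is what a template wrapper around GC.malloc would pass as the bitmask.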


Re: TDPL reaches Thermopylae level

2009-10-26 Thread Bill Baxter
On Mon, Oct 26, 2009 at 4:05 PM, Jeremie Pelletier jerem...@gmail.com wrote:
 Andrei Alexandrescu wrote:

 Jeremie Pelletier wrote:

 Andrei Alexandrescu wrote:

 Bill Baxter wrote:

 On Mon, Oct 26, 2009 at 8:47 AM, Jeremie Pelletier jerem...@gmail.com
 wrote:

 Andrei Alexandrescu wrote:

 303 pages and counting!

 Andrei

 Soon the PI level, or at least 10 times PI!


 A hundred even. ;-)

 Coming along. I'm writing about strings and Unicode right now. I was
 wondering what people think about allowing concatenation (with ~ and ~=) of
 strings of different character widths. The support library could do all of
 the transcoding.

 (I understand that concatenating an array of wchar or char with a dchar
 is already in bugzilla.)


 Andrei

 I don't know if that's a good idea, it's better when string encoding is
 explicit so you know where your reallocations are.

 The beauty of it is that reallocation with ~ occurs anyway, and with ~= is
 anyway imminent, regardless of the character width you're reallocating.

 Allowing concatenation of strings of different widths is a nice way of
 acknowledging at the language level that all character widths are encodings
 of abstract characters.

 i.e. if I know some routine will have to convert a utf16 parameter to utf8
 to append it to a string, then I'll try and either make it output utf16 or
 input utf8. If it's implicit it's much harder to find and optimize these
 cases.

 to!string() is easy enough to use anyways.

 But it could be good to add a range type that does this with multiple
 opAppend/opAppendAssign overloads.

 One problem with

 s ~= to!string(someDstring);

 is that it does two allocations instead of one.


 Andrei

 Good points, I didn't think of the separation between characters and
 encodings or the extra allocation from to.

 You have my vote for this feature then!

 Jeremie


Yeh, me too.  Saving an allocation is good.  And I agree that having
~= do a conversion is much more useful than just getting an error.
It's one of those things you might try just hoping it will work, and
it's always nice when something like that does just what you hope it
will.

I guess the only other thing I could worry about is that in generic
array code it might cause someone headaches that for some T[] and S[],
T[] ~= S[] is legal and the length of the result is not the sum of the
lengths of the inputs.  But I can't think of any real situation where
that would cause trouble.

--bb


Re: GC Precision

2009-10-26 Thread Leandro Lucarella
Andrei Alexandrescu, el 26 de octubre a las 11:01 me escribiste:
 Sean Kelly wrote:
 dsimcha Wrote:
 
 == Quote from Sean Kelly (s...@invisibleduck.org)'s article
 dsimcha Wrote:
 I just realized last night that D's templates are probably powerful 
 enough now
 to generate bit masks that can be used for precise GC heap scanning.  I'm
 halfway (emphasis on halfway) thinking of using this to try to hack the 
 GC and
 make heap scanning fully precise except for the corner case of unions.
 However, this ties into several things that others in the D community are
 doing, so I want to gauge people's responses and make sure I'm not wasting
 effort on something that will be useless in 6 months.
 
 1.  Sean, Leonardo, whoever else may be working on GC implementations, 
 have
 you by any chance broken ground on precise heap scanning already?
 I've thought about it, but not done anything about it.  The compiler 
 doesn't
 provide this information, so precise scanning would require a user-level 
 call.
 You'll also have to deal with arrays of structs, by the way.
 
 Arrays of structs are easy:  Generate a bitmask for one element, and keep 
 reusing
 that bitmask until the end of the block.  Am I missing something?
 
 Nope.
 
 One question is, is there enough information for stack variables? My
 understanding from a while ago was that heap data could be
 reasonably analyzed, but stack data has no info associated with it.

There is some discussion about precise stack in the thread I cited too.
The stack can be also precise adding a shadow stack (like a TypeInfo for
the stack) but it's costly and it needs compiler support (LLVM has some
mechanisms to generate this stack information AFAIK but I never played
with it).

The problem with the stack is that you still have to interact with C,
which has no stack information, so it's a little more debatable how much
the extra complexity will pay off.

I agree with David that it's much more reasonable to make the heap precise
first and then see how things are going from there.

-- 
Leandro Lucarella (AKA luca) http://llucax.com.ar/
--
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
--
hypocrite opportunist
don't infect me with your poison


Re: GC Precision

2009-10-26 Thread Leandro Lucarella
dsimcha, el 26 de octubre a las 23:05 me escribiste:
 I've spent some free brain cycles today thinking about this and here's the 
 scheme
 I have in mind.  If anyone thinks this could be improved in a way that would 
 not
 have substantial ripple effects throughout the compiler/language (because 
 then it
 might never actually get implemented) let me know.
 
 1.  GC.malloc's signature changes to GC.malloc(size_t size, uint ba = 0, 
 size_t*
 bitmask = null).  A null bitmask means use the old-school behavior and either 
 scan
 everything or don't scan anything based on the NO_SCAN bit.  IMHO plain old
 conservative scanning must be supported to allow for untyped memory blocks to 
 be
 allocated, because requiring every memory block to have a type associated 
 with it
 is an unacceptable limitation in a systems language.
 
 For now, the only way to get precise scanning would be to call malloc 
 directly or
 use a template that does.  Eventually new would either be fixed to 
 instantiate the
 bitMask!(T) template or eliminated entirely.  Since using the precise 
 scanning by
 writing a template that calls GC.malloc() is a lot easier than hacking the
 compiler, this may be a catalyst for getting rid of new.

Did you read the thread I posted? What do you think about Frits's idea on
how to encode the pointer information? I'm not very familiar with
TypeInfo, but wouldn't it be more natural to pass the TypeInfo to GC.malloc()
directly if it can get the pointer information itself instead of
translating that information? I think if that's possible it will keep the
GC interface simpler.

 2.  Some concern has been expressed in the past about the possibility of 
 using 4
 bytes per block in overhead to store pointers to bitmasks.  IMHO this concern 
 is
 misplaced because in any program that takes up enough memory for space 
 efficiency
 to matter, false pointers waste more space.  Furthermore, if you're 
 programming
 for a toaster oven, elevator controller, etc., you probably aren't going to 
 use
 the standard out-of-the-box GC anyhow.

This seems reasonable, but I don't see why we can't use a more efficient
way to store this information in the GC. Implementation simplicity (to
have a better GC now instead of a perfect GC at some point in the far
future) is a good enough reason :) I'm just curious if you found any flaws
in the scheme proposed by Frits or if it's just that you want a simpler
implementation.

 However, I've thought of ways to mitigate this and have come up with the
 following:
 
 A.  Store a pointer to the bitmask iff the NO_SCAN bit is set.
 this is a no-brainer and will prevent any overhead on,
 for example, arrays of floats.

Good idea.

 B.  Add another attributes bit to the GC called something like
 NEEDS_BITMASK.  This bit would be set iff an object mixes
 pointers and non-pointers.  If it's not set, no bitmask
 pointer would be stored.  However, the overhead of an
 extra bit may or may not be worth it.

You mean for the special case where all the attributes of an object are
pointers? I think this should be rare enough, so I doubt about the
utility, but maybe I'm missing some common case.

 3.  The bitmask pointer would be stored at the end of every GC-allocated 
 block for
 which a bitmask pointer is stored.  The reason for using the end of the block
 instead of the beginning is just implementation simplicity:  That way, 
 finding the
 beginning of a block would work the same whether or not we have a bitmask 
 pointer.

Did you ever consider storing this information outside the memory block
(like the flags)? I think storing it in the memory block can be annoying
if it is not always stored, because then your fixed-size blocks are not
fixed. It might be very easy to overcome this, but maybe thinking about
the other option is worthwhile.

 4.  The bitmask would be a size_t[] created at compile time by a template and
 stored in the static data segment.  Its layout would be [length of array,
 T.sizeof, offsets that need to be scanned].  For example, if you have 
 something like:

Again, I wonder if this information can't be still obtained from the
TypeInfo.

 struct Foo {
 uint bar;
 void* ptr;
 }
 
 On a 32-bit machine, bitMask!Foo would be [3, 8, 4].  On a 64-bit, it would 
 be [3,
 16, 8].  The reason the size of the array is stored in the array is so that 
 we can
 get away with storing a single ptr in each memory block instead of a pointer 
 and a
 length.
 
 5.  To store information about pinning, we simply use the high-order bits of 
 the
 pointer offsets.  1 means pinned, 0 means not pinned.  This means that, for 
 any
 type T, T.sizeof can't be bigger than size_t.max / 2.  I think this is a 
 fairly
 minor limitation.

I like that Frits's proposal added information about weak pointers too;
this might fix another big gap in the memory management
area in D.

-- 
Leandro Lucarella (AKA luca) 

Re: GC Precision

2009-10-26 Thread dsimcha
== Quote from Leandro Lucarella (llu...@gmail.com)'s article
 dsimcha, el 26 de octubre a las 23:05 me escribiste:
  I've spent some free brain cycles today thinking about this and here's the 
  scheme
  I have in mind.  If anyone thinks this could be improved in a way that 
  would not
  have substantial ripple effects throughout the compiler/language (because 
  then it
  might never actually get implemented) let me know.
 
  1.  GC.malloc's signature changes to GC.malloc(size_t size, uint ba = 0, 
  size_t*
  bitmask = null).  A null bitmask means use the old-school behavior and 
  either scan
  everything or don't scan anything based on the NO_SCAN bit.  IMHO plain old
  conservative scanning must be supported to allow for untyped memory blocks 
  to be
  allocated, because requiring every memory block to have a type associated 
  with it
  is an unacceptable limitation in a systems language.
 
  For now, the only way to get precise scanning would be to call malloc 
  directly or
  use a template that does.  Eventually new would either be fixed to 
  instantiate the
  bitMask!(T) template or eliminated entirely.  Since using the precise 
  scanning by
  writing a template that calls GC.malloc() is a lot easier than hacking the
  compiler, this may be a catalyst for getting rid of new.
 Did you read the thread I posted? What do you think about Frits's idea on
 how to encode the pointer information? I'm not very familiar with
 TypeInfo, but wouldn't it be more natural to pass the TypeInfo to GC.malloc()
 directly if it can get the pointer information itself instead of
 translating that information? I think if that's possible it will keep the
 GC interface simpler.

As far as I can tell, RTTI doesn't have all the stuff that's needed yet for
structs and adding it would require hacking the compiler.  Frankly, I want this 
to
be simple enough to actually get implemented as opposed to just being talked 
about
and deadlocking on everyone waiting on everyone else to do something.

  2.  Some concern has been expressed in the past about the possibility of 
  using 4
  bytes per block in overhead to store pointers to bitmasks.  IMHO this 
  concern is
  misplaced because in any program that takes up enough memory for space 
  efficiency
  to matter, false pointers waste more space.  Furthermore, if you're 
  programming
  for a toaster oven, elevator controller, etc., you probably aren't going to 
  use
  the standard out-of-the-box GC anyhow.
 This seems reasonable, but I don't see why we can't use a more efficient
 way to store this information in the GC. Implementation simplicity (to
 have a better GC now instead of a perfect GC at some point in the far
 future) is a good enough reason :) I'm just curious if you found any flaws
 in the scheme proposed by Frits or if it's just that you want a simpler
 implementation.

Two things:  Implementation simplicity is one.  As I've been alluding to, worse
and works in practice is better than better and only exists on paper.  The other
is that I don't understand, in Frits's approach, how do you store the size of 
the
object so you know how many bits to interpret as the mask?

  However, I've thought of ways to mitigate this and have come up with the
  following:
 
  A.  Store a pointer to the bitmask iff the NO_SCAN bit is set.
  this is a no-brainer and will prevent any overhead on,
  for example, arrays of floats.
 Good idea.
  B.  Add another attributes bit to the GC called something like
  NEEDS_BITMASK.  This bit would be set iff an object mixes
  pointers and non-pointers.  If it's not set, no bitmask
  pointer would be stored.  However, the overhead of an
  extra bit may or may not be worth it.
 You mean for the special case where all the attributes of an object are
 pointers? I think this should be rare enough, so I doubt about the
 utility, but maybe I'm missing some common case.

You're probably right.  The only common one I can think of is arrays of classes.
For an array, the 4 bytes of overhead is usually negligible.

  3.  The bitmask pointer would be stored at the end of every GC-allocated 
  block for
  which a bitmask pointer is stored.  The reason for using the end of the 
  block
  instead of the beginning is just implementation simplicity:  That way, 
  finding the
  beginning of a block would work the same whether or not we have a bitmask 
  pointer.
 Did you ever consider storing this information outside the memory block
 (like the flags)? I think storing it in the memory block can be annoying
 if it is not always stored, because then your fixed-size blocks are not
 fixed. It might be very easy to overcome this, but maybe thinking about
 the other option is worthwhile.

The flags in the GC are designed to store single bits apparently.  Also, as far 
as
weak refs, we could use another high-order bit for that.  I don't think anyone
cares if (on 32-bit) T.sizeof is limited to about a gigabyte.  The question is,

Re: Disallow catch without parameter (LastCatch)

2009-10-26 Thread Christopher Wright

grauzone wrote:

Christopher Wright wrote:


Please keep full attributions.

PS: I wonder, should the runtime really execute finally blocks if an 
Error exception is thrown? (Errors are for runtime errors, 
Exception for normal exceptions.) Isn't it dangerous to execute 
arbitrary user code in presence of what is basically an internal error?


Are all Errors unrecoverable except by immediately aborting the 
application?


What about logging?

What about putting up a reasonable error message for the user?

What about restarting the failed module in case the issue was 
temporary and environmental?


Something is wrong with your program internally if something like this 
happens. You can't expect a consistent program state. And most of the 
code in finally blocks was not written for such situations. You'll 
probably end up throwing another runtime error from within a finally block.


Quite possibly. But immediately terminating the process is simply not 
acceptable. How am I going to fix this problem if I can't even log that 
a problem occurred? If I have a SaaS application, I have to rely on my 
users to email or call up to find out something bad happened?


What if it's an assertion error or a bounds error in a plugin? I can 
unload that plugin and continue on with no issues.


I'm getting OutOfMemoryErrors? I'll disable caching and prefetching and 
reduce my memory footprint by 80%. Problem solved.


There is one category of errors that is not recoverable. If the runtime 
is left in an inconsistent state, it should try to output an error 
message and terminate. Everything else, an application could potentially 
handle.
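
A minimal sketch of the policy being argued for, purely for illustration (runPlugin and the messages are made-up names, not an existing API): recover from ordinary Exceptions, but at least log an Error before letting it propagate and take the process down.

import std.stdio;

void runPlugin(void delegate() plugin)
{
    try
    {
        plugin();
    }
    catch (Error e)          // internal error: log it, then let it propagate
    {
        writeln("internal error in plugin: ", e.msg);
        throw e;
    }
    catch (Exception e)      // ordinary, recoverable failure
    {
        writeln("plugin failed, continuing: ", e.msg);
    }
}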


Bug? GC collects memory that references itself.

2009-10-26 Thread Jeremie Pelletier
I need objects that may live without any references in GC memory; this 
is for bindings to a C++ library. Using ranges or roots would be very 
inefficient, so my solution was to have the objects reference themselves 
until the C++ side of the object calls a finalizer which nullifies the 
reference to let the GC collect the memory.


The GC doesn't see things this way however, and collects the objects 
even when they reference themselves.


I've made a simple test program, the objects should never get collected.

---

import core.memory;
import std.stdio;

class Foo {
    Foo self;
    this() { self = this; }
    ~this() { assert(!self); }
}

void main() {
    foreach(i; 0 .. 50) new Foo;
    GC.collect();
    writeln("No object collected!");
}

---

If it's a feature of the GC to prevent objects from never being 
collected, how do I bypass it?


Jeremie
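
For reference, the root-based bypass (which the post rules out as too inefficient for this use case, but which is the mechanism core.memory already provides) looks roughly like this; Wrapper and release are made-up names:

import core.memory;

class Wrapper {
    this()
    {
        // keep the object alive even with no D-side references;
        // a self-reference is still unreachable to a tracing GC
        GC.addRoot(cast(void*) this);
    }

    // to be called from the C++ side's finalizer
    void release()
    {
        GC.removeRoot(cast(void*) this);
    }
}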


[Issue 3440] invalid -X JSON output, a comma is missing

2009-10-26 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=3440


Ary Borenszweig a...@esperanto.org.ar changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||a...@esperanto.org.ar
 Resolution||DUPLICATE


--- Comment #1 from Ary Borenszweig a...@esperanto.org.ar 2009-10-26 01:01:09 
PDT ---
Please always search bugs before creating new ones. :-)

*** This issue has been marked as a duplicate of issue 3415 ***



[Issue 3442] New: scope(exit) Problem

2009-10-26 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=3442

   Summary: scope(exit) Problem
   Product: D
   Version: 2.035
  Platform: x86
OS/Version: Windows
Status: NEW
  Severity: regression
  Priority: P2
 Component: DMD
AssignedTo: nob...@puremagic.com
ReportedBy: m...@vermi.fr


--- Comment #0 from Vermi m...@vermi.fr 2009-10-26 04:00:32 PDT ---
I don't know if it's really a dmd bug or a windows bug, because I made
several tests and I don't really understand where the error is. Here is the
code (it was working with an earlier version of dmd, 2.022 if I remember well):

protected bool _onPaint(PAINTSTRUCT paintStr)
{
  int oldMode = SetBkMode(paintStr.hdc, TRANSPARENT);
  scope(exit)
  {
    SetBkMode(paintStr.hdc, oldMode);
    MessageBoxA(null, "tic", "tic", MB_OK);
  }

  // Some code

  return true;
}


If the function returns without an exception, the code in the scope statement is
properly executed and functional, and the Background Mode for the DC is restored
to OPAQUE (its original value).

If an exception is thrown (with the code (cast(Object)null).toString() for
example), the MessageBox shows, but the background mode of the DC is not restored,
and the background simply disappears in the window.

I made some tests on the return value of SetBkMode:

If an exception is thrown, SetBkMode returns OPAQUE (which is strange, it should
return TRANSPARENT according to MSDN).

If no exception is thrown, SetBkMode returns 0, which is strange too, because 0
means error. I don't know what to think.



[Issue 3443] New: Thread.thread_needLock() should be const pure nothrow

2009-10-26 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=3443

   Summary: Thread.thread_needLock() should be const pure nothrow
   Product: D
   Version: 2.035
  Platform: Other
OS/Version: Windows
Status: NEW
  Severity: normal
  Priority: P2
 Component: druntime
AssignedTo: s...@invisibleduck.org
ReportedBy: dsim...@yahoo.com


--- Comment #0 from David Simcha dsim...@yahoo.com 2009-10-26 06:21:38 PDT ---
All it does is return a boolean member variable, so it clearly is really const
pure nothrow.  Fixing this would allow me to remove a serious kludge from some
of my code.



[Issue 3123] std.algorithm.zip fails on 'lazy' ranges

2009-10-26 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=3123


Andrei Alexandrescu and...@metalanguage.com changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||and...@metalanguage.com
 AssignedTo|nob...@puremagic.com|and...@metalanguage.com




[Issue 3444] New: foreach(i, elem; range) should work

2009-10-26 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=3444

   Summary: foreach(i, elem; range) should work
   Product: D
   Version: 2.035
  Platform: Other
OS/Version: Windows
Status: NEW
  Severity: enhancement
  Priority: P2
 Component: DMD
AssignedTo: nob...@puremagic.com
ReportedBy: dsim...@yahoo.com


--- Comment #0 from David Simcha dsim...@yahoo.com 2009-10-26 18:38:59 PDT ---
Currently, one cannot do a foreach statement over a range that also gives the
index.  This is inconsistent with arrays.  I'm not sure if it's the best fix,
but at least a temporary fix is to mix this thing into all ranges:

template CountForeach(I) {
    int opApply(int delegate(ref I, ref typeof(this.front())) dg) {
        I index = 0;
        int result;
        foreach(elem; this) {
            result = dg(index, elem);
            if(result) {
                break;
            }
            ++index;
        }

        return result;
    }
}

Usage:

struct SomeRange {

    SomeType front() { return something; }

    void popFront() {
        doStuff();
    }

    bool empty() {
        return amIEmpty();
    }

    mixin CountForeach!size_t;
}

void main() {
    SomeRange someRange;

    foreach(elem; someRange) {}  // Uses range interface directly.

    foreach(i, elem; someRange) {}  // Uses the mixin;
}
