Just a thought: pure functions & a compacting GC

2009-03-18 Thread Daniel Keep

Just thinking out loud; no proposals or anything.  :)

I've been floating around the idea of doing some "for fun" programming
on my Wii.  Of course, I'd like to use D.  One of the concerns I have is
that the Wii has a tiny amount of memory compared to a PC (about 76 MB
or so) so you don't want to generate too much garbage.

There seem to be three kinds of allocations: big and/or long-lived
allocations, which aren't much of an issue; medium-lived objects that
get allocated, used and mutated, then destroyed (scope deals with most
of these); and transient garbage.  Transient garbage includes
side-effects of calling other functions or even language features.

We really don't have a "good" way of dealing with these tiny,
short-lived allocations.  One thought I had was to extend the GC to
include a new method:

auto result = GC.scoped( /* lazy expr, function ptr or delegate */ );

This would call the given code and return the result.  Just before it
returns, however, it makes a deep-copy of the result and then nukes all
the pages it allocated during the call.

Deep copying isn't a tremendous problem since we know the type of the
result, and we know which pointers are inside the to-be-nuked pages.
The only fly in the ointment would be if we'd stored a reference to the
to-be-nuked pages in external variables, like a global.

Which is exactly what a pure function is prohibited from doing.

Also note that the above scoped call is basically doing what a
compacting GC does: compacting live allocations, and then dropping the
garbage off the end.
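
To make the idea concrete, here is a rough sketch in D of the copy-out-then-drop step. It does not touch the real GC: Region, makeArray and scopedCall are made-up names, and a bump allocator over a fixed buffer stands in for "the pages allocated during the call". It also assumes the callee keeps no references to region memory anywhere else, which is the guarantee a pure function would give.

```d
// Hypothetical bump allocator standing in for "pages allocated during the call".
struct Region
{
    ubyte[] buf;   // backing storage
    size_t used;   // bytes handed out so far

    // Carve a T[] of length n out of the buffer (contents are not cleared).
    T[] makeArray(T)(size_t n)
    {
        enum a = T.alignof;
        immutable start = (used + a - 1) & ~(a - 1);
        immutable end = start + n * T.sizeof;
        assert(end <= buf.length, "region exhausted");
        used = end;
        return (cast(T*)(buf.ptr + start))[0 .. n];
    }
}

// Run dg, copy its result into the normal GC heap, then drop everything
// dg allocated from the region. Only int[] results are handled here.
int[] scopedCall(ref Region r, scope int[] delegate(ref Region) dg)
{
    immutable mark = r.used;
    int[] tmp = dg(r);
    int[] result = tmp.dup;   // the "deep copy" (flat for int[])
    r.used = mark;            // "nuke" the transient allocations
    return result;
}
```

A caller would build its scratch data with reg.makeArray inside the delegate and return only the slice it wants to keep; after scopedCall returns, the region is back to where it started and only the copy survives.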

This brings up two interesting questions:

1. Should D 2.0 have a similar GC.scoped call?

2. Should D 2.0 do automatic compaction for pure functions?

Thoughts?

  -- Daniel


reconsideration of header files

2009-03-18 Thread davidl

Support for generating headers is limited.

dmd -H -o- abc.d
still emits error messages, i.e., semantic checks still run. I couldn't work
out a way of bypassing the semantic check from the command line.



Also, I assume header files are only for interfacing; they help only a
little with compile speed. The only way to significantly boost
compilation speed is to feed dmd everything at once. An obvious
example is compiling dwt: almost every file needs 3 sec to be converted
to an obj. By the way, rebuild 0.78 always seems to compile one file at a
time; that seems to be related to the dmd module system change.


I'm just curious where the time goes. 3 sec for compiling 1 file to obj is
almost at the limit of what most developers can bear. I've already converted
all dwt .d files to .di files using the dmd -H option. There seems to
be some other time-consuming work going on.


D header files have never been widely used in projects. Maybe we should
discuss them a little further.


The Joy of Signalling NaNs! (A compiler patch)

2009-03-18 Thread Don
It's great that D initializes floating-point variables to NaN, instead
of whatever random garbage happened to be in RAM.
But if your calculation ends up with a NaN, you have to work out where
it came from. Worse, the NaN might not necessarily be visible in your
final results, but your results may nonetheless be wrong.

The hardware has excellent support for debugging these problems -- all
you need to do is activate the floating-point 'invalid' trap,
and you'll get a hardware exception whenever you _create_ a NaN. What
about uninitialized variables?
Signalling NaNs are designed for exactly this situation. The instant you
access a signalling NaN, a hardware exception occurs,
and you drop straight into your debugger (just like accessing a null
pointer).
But this could only work if the compiler initialized every uninitialized
floating-point variable to a signalling NaN.


Now that we have access to the backend (thanks Walter!), we can do
exactly that. My patch (below) is enabled only when compiled with DMC.
real.nan is unchanged, and won't cause exceptions if you use it, but
real.init is now a signalling NaN.
This doesn't make any difference to anything, until you enable FP 
exceptions.
And when you do, if no exceptions occur, you can use the code coverage 
feature to give very high confidence that you're not using any 
uninitialised floating-point variables.
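
For anyone who wants to see what "signalling" means at the bit level, here is a small D sketch for 64-bit doubles (the patch itself works on 80-bit reals). The helper names are illustrative, and it assumes the usual IEEE 754 layout used on x86, where the top mantissa bit is the quiet bit. It only inspects bit patterns, so nothing traps.

```d
// Raw IEEE 754 bits of a double.
ulong bitsOf(double d)
{
    return *cast(ulong*) &d;
}

enum ulong exponentMask = 0x7FF0_0000_0000_0000UL;
enum ulong mantissaMask = 0x000F_FFFF_FFFF_FFFFUL;
enum ulong quietBit     = 0x0008_0000_0000_0000UL;

// NaN: exponent all ones and a non-zero mantissa.
bool isNaNBits(ulong bits)
{
    return (bits & exponentMask) == exponentMask
        && (bits & mantissaMask) != 0;
}

// Signalling NaN: a NaN whose quiet bit is clear.
bool isSignallingBits(ulong bits)
{
    return isNaNBits(bits) && (bits & quietBit) == 0;
}
```

With these, isSignallingBits(bitsOf(double.nan)) is false, while a hand-built pattern such as 0x7FF0_0000_0000_0001 counts as signalling.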


I propose that this should become part of DMD. It doesn't need to be in 
the spec, it's primarily for debugging.


Don.

==
Example of usage:
==

void main()
{
    double a, b, c;
    a *= 7;   // Exactly the same as it is now, a is nan.

    enableExceptions();

    c = 6;    // ok, c is initialized now
    c *= 10;
    b *= 10;  // BANG ! Straight into the debugger
    b *= 5;

    disableExceptions();
}

---

void enableExceptions() {
    version(D_InlineAsm_X86) {
        short cont;
        asm {
            fclex;
            fstcw cont;
            mov AX, cont;
            and AX, 0xFFFE; // enable invalid exception
            mov cont, AX;
            fldcw cont;
        }
    }
}

void disableExceptions() {
    version(D_InlineAsm_X86) {
        short cont;
        asm {
            fclex;
            fstcw cont;
            mov AX, cont;
            or AX, 0x1; // disable invalid exception
            mov cont, AX;
            fldcw cont;
        }
    }
}


=
Patches to DMD to turn all uninitialized floats into SNANs.

Changes are in mytype.c and e2ir.c
=
mytype.c:
=
line 2150:

Expression *TypeBasic::defaultInit(Loc loc)
{   integer_t value = 0;
#if __DMC__
// Note: could be up to 16 bytes long.
	unsigned short snan[8] = { 0xFFFF, 0xFFFF, 0xFFFF, 0xBFFF, 0x7FFF, 0, 0, 0 };

d_float80 fvalue = *(long double*)snan;
#endif

line 2177:

case Tfloat80:
#if __DMC__
return new RealExp(loc, fvalue, this);
#else
return getProperty(loc, Id::nan);
#endif

line 2186:

case Tcomplex80:
#if __DMC__
{   // Can't use fvalue + I*fvalue (the im part becomes a quiet NaN).
complex_t cvalue;
((real_t *)&cvalue)[0] = fvalue;
((real_t *)&cvalue)[1] = fvalue;
return new ComplexExp(loc, cvalue, this);
}
#else
return getProperty(loc, Id::nan);
#endif
=
e2ir.c line 1182.
=

bool isSignallingNaN(real_t x)
{
#if __DMC__
if (x>=0 || x<0) return false;
return !((((unsigned short*)&x)[3])&0x4000);
#else
return false;
#endif
}


elem *RealExp::toElem(IRState *irs)
{   union eve c;
tym_t ty;

//printf("RealExp::toElem(%p)\n", this);
memset(&c, 0, sizeof(c));
ty = type->toBasetype()->totym();
switch (tybasic(ty))
{
case TYfloat:
case TYifloat:
c.Vfloat = value;
if (isSignallingNaN(value) ) {
((unsigned int*)&c.Vfloat)[0] &= 0xFFBFFFFFL;
}
break;

case TYdouble:
case TYidouble:
c.Vdouble = value; // this unfortunately converts SNAN to QNAN.
if ( isSignallingNaN(value) ) {
((unsigned int*)&c.Vdouble)[1] &= 0xFFF7FFFFL;
}
break;

case TYldouble:
case TYildouble:
c.Vldouble = value;
break;

default:
print();
type->print();
type->toBasetype()->print();
printf("ty = %d, tym = %x\n", type->ty, ty);
assert(0);
}
return el_const(ty, &c);
}

elem *ComplexExp::toElem(IRState *irs)
{   union eve c;
tym_t ty;
real_t re;
real_t im;

re = creall(value);
im = cimagl(value);

memset(&c, 0, si

A bug in the back-end?

2009-03-18 Thread Sergey Gromov
I was reading the 1.041's back-end code.  Here's something that looked
wrong to me: in cgelem.c, in function elcmp(), at line 3318:

case 4:
if (sz > 2)
e = el_una(OPu32_64,TYshort,e);
else
e = el_una(OP32_16,TYshort,e);
break;

Specifically the line

e = el_una(OPu32_64,TYshort,e);

looks wrong, the unary operator type should be 64 bit, not short.  Am I
missing something?


Re: new D2.0 + C++ language

2009-03-18 Thread Robert Jacques

On Wed, 18 Mar 2009 13:48:55 -0400, Craig Black  wrote:


bearophile Wrote:


Weed:
> I want to offer the dialect of the language D2.0, suitable for use where
> are now used C/C++. Main goal of this is making language like D, but
> corresponding "zero-overhead principle" like C++:
>...
> The code on this language almost as dangerous as a code on C++ - it is a
> necessary cost for increasing performance.

No, thanks...

And regarding performance, eventually it will come a lot from a good  
usage of multiprocessing, that in real-world programs may need pure  
functions and immutable data. That D2 has already, while C++ is less  
lucky.


Bye,
bearophile


Multiprocessing can only improve performance for tasks that can run in  
parallel.  So far, every attempt to do this with GC (that I know of) has  
ended up slower, not faster.  Bottom line, if GC is the bottleneck, more  
CPU's won't help.


For applications where GC performance is unacceptable, we either need a  
radically new way to do GC faster, rely less on the GC, or drop GC  
altogether.


However, in D, we can't get rid of the GC altogether, since the compiler  
relies on it.  But we can use explicit memory management where it makes  
sense to do so.


-Craig


*Sigh*, you do know people run cluster & multi-threaded Java apps all the  
time right? I'd recommend reading about concurrent GCs  
http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Stop-the-world_vs._incremental_vs._concurrent.  
By the way, traditional malloc has rather horrible multi-threaded  
performance as 1) it creates lots of kernel calls and 2) requires a global  
lock on access. Yes, there are several alternatives available now, but the  
same techniques work for enabling multi-threaded GCs. D's shared/local  
model should support thread local heaps, which would improve all of the  
above.


Re: class Exception

2009-03-18 Thread Sean Kelly
== Quote from Steve Teale (steve.te...@britseyeview.com)'s article
> Denis Koroskin Wrote:
> > On Wed, 18 Mar 2009 17:01:25 +0300, Steve Teale 
> >  wrote:
> >
> > > Another dumb question - where is class Exception defined/described?
> >
> > dmd1\src\phobos\object.d
> > dmd2\src\druntime\import\object.di
> Yes, eventually found it. Also dmd2,
> dmd\src\druntime\src\compiler\dmd\object_.d. I was looking
> because I remembered seeing once a distinction being made between class
> Exception, and class Error
> These seem to be defined identically - is one of them defunct?

They're base classes for different categories of errors, somewhat like
in Java.  See core/exception.di for a run-down.
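
A small sketch of how the two categories are typically told apart in D2, where Exception and Error are separate branches under Throwable. classify is a made-up helper, and catching Error is done here only for illustration; normally an Error signals a bug and is left to terminate the program.

```d
// Runs dg and reports which category of throwable, if any, escaped from it.
string classify(void delegate() dg)
{
    try
    {
        dg();
        return "ok";
    }
    catch (Exception e)   // recoverable condition, e.g. bad input
    {
        return "exception: " ~ e.msg;
    }
    catch (Error e)       // programming bug; caught only for this demo
    {
        return "error: " ~ e.msg;
    }
}
```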


Re: new D2.0 + C++ language

2009-03-18 Thread Christopher Wright

Weed wrote:

bearophile wrote:

Weed:

I want to offer the dialect of the language D2.0, suitable for use where
are now used C/C++. Main goal of this is making language like D, but
corresponding "zero-overhead principle" like C++:
...
The code on this language almost as dangerous as a code on C++ - it is a
necessary cost for increasing performance.

No, thanks...

And regarding performance, eventually it will come a lot from a good
usage of multiprocessing,


The proposal will be able support multiprocessing - for it provided a
references counting in the debug version of binaries. If you know the
best way for language *without GC* guaranteeing the existence of an
object without overhead - I have to listen!


You cannot alter the reference count of an immutable variable.
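
To spell that out with a sketch: the names are illustrative, and the intrusive refs field is only an assumption about how the counting would be stored. Once the object is immutable, the counter stored inside it is immutable too, so the compiler rejects any attempt to bump it.

```d
struct Counted
{
    int refs;      // intrusive reference count (assumed design)
    int payload;
}

// Taking a new reference means writing to the object.
void retain(ref Counted c)
{
    ++c.refs;
}

immutable Counted frozen = Counted(0, 42);

// Passing an immutable object to retain() does not compile.
static assert(!__traits(compiles, retain(frozen)));
```

Keeping the count outside the object, in a side table for example, would avoid the problem, at the price of an extra lookup.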


Re: new D2.0 + C++ language

2009-03-18 Thread Sergey Gromov
Wed, 18 Mar 2009 13:48:55 -0400, Craig Black wrote:

> bearophile Wrote:
> 
>> Weed:
>>> I want to offer the dialect of the language D2.0, suitable for use where
>>> are now used C/C++. Main goal of this is making language like D, but
>>> corresponding "zero-overhead principle" like C++:
>>>...
>>> The code on this language almost as dangerous as a code on C++ - it is a
>>> necessary cost for increasing performance.
>> 
>> No, thanks...
>> 
>> And regarding performance, eventually it will come a lot from a good usage 
>> of multiprocessing, that in real-world programs may need pure functions and 
>> immutable data. That D2 has already, while C++ is less lucky.
>> 
>> Bye,
>> bearophile
> 
> Multiprocessing can only improve performance for tasks that can run
> in parallel.  So far, every attempt to do this with GC (that I know
> of) has ended up slower, not faster.  Bottom line, if GC is the
> bottleneck, more CPU's won't help. 
> 
> For applications where GC performance is unacceptable, we either need
> a radically new way to do GC faster, rely less on the GC, or drop GC
> altogether. 
> 
> However, in D, we can't get rid of the GC altogether, since the
> compiler relies on it.  But we can use explicit memory management
> where it makes sense to do so. 
> 
> -Craig

I think that the "shared" memory concept in D2 is introduced
specifically to improve multi-processing GC performance.  There is going
to be a thread-local GC for every thread allocating memory, and, since
thread-local will be the default allocation strategy, most memory will
be GCed without synchronizing with other threads.
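
A small sketch of the storage model this relies on: in D2, module-level variables are thread-local unless marked shared. It shows only the default-local versus shared distinction, not a thread-local GC, which is the speculative part. The variable and function names are illustrative.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

int perThread = 0;             // thread-local by default: one copy per thread
shared int acrossThreads = 0;  // a single copy visible to every thread

// Runs in the worker thread and writes to both variables.
void worker()
{
    perThread = 99;                   // touches only the worker's own copy
    atomicOp!"+="(acrossThreads, 1);  // shared data needs synchronisation
}

void bumpInWorker()
{
    auto t = new Thread(&worker);
    t.start();
    t.join();
}
```

After bumpInWorker() returns, the calling thread still sees perThread == 0, while acrossThreads has been incremented.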


Re: new D2.0 + C++ language

2009-03-18 Thread Craig Black
bearophile Wrote:

> Weed:
> > I want to offer the dialect of the language D2.0, suitable for use where
> > are now used C/C++. Main goal of this is making language like D, but
> > corresponding "zero-overhead principle" like C++:
> >...
> > The code on this language almost as dangerous as a code on C++ - it is a
> > necessary cost for increasing performance.
> 
> No, thanks...
> 
> And regarding performance, eventually it will come a lot from a good usage of 
> multiprocessing, that in real-world programs may need pure functions and 
> immutable data. That D2 has already, while C++ is less lucky.
> 
> Bye,
> bearophile

Multiprocessing can only improve performance for tasks that can run in 
parallel.  So far, every attempt to do this with GC (that I know of) has ended 
up slower, not faster.  Bottom line, if GC is the bottleneck, more CPU's won't 
help.

For applications where GC performance is unacceptable, we either need a 
radically new way to do GC faster, rely less on the GC, or drop GC altogether.

However, in D, we can't get rid of the GC altogether, since the compiler relies 
on it.  But we can use explicit memory management where it makes sense to do so.

-Craig


Re: eliminate writeln et comp?

2009-03-18 Thread Sean Reque
Ruby handles this by putting such functions on its IO class as instance 
methods, and then defining global functions that are just aliases to more 
verbosely calling instance methods on global IO objects. For instance, puts is 
just an alias to $stdout.puts, gets is just an alias to $stdin.gets, etc. That 
way, people who want to type as little as possible to write Hello World 
programs are happy, and people who want a consistent object-oriented syntax and 
the benefits of polymorphism are also happy. Everyone is happy!
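
The same arrangement could be sketched in D like this. Output, defaultOut and puts are illustrative names, not Phobos API, and a list of strings stands in for the real stream so the effect is easy to check.

```d
class Output
{
    string[] lines;

    // Instance method: the "real" implementation.
    void puts(string s) { lines ~= s; }
}

// The default output object (one per thread), created on first use.
Output defaultOut()
{
    static Output instance;
    if (instance is null)
        instance = new Output;
    return instance;
}

// Free function: simply forwards to the method on the default object.
void puts(string s)
{
    defaultOut.puts(s);
}
```

Short programs call puts("Hello, world") directly; code that wants to redirect or substitute the output creates its own Output and calls the method on it.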


D2 std_array is a dead link

2009-03-18 Thread Frank Benoit
http://www.digitalmars.com/d/2.0/phobos/std_array.html

Not Found

The requested URL /d/2.0/phobos/std_array.html was not found on this server.


Re: new D2.0 + C++ language

2009-03-18 Thread BCS

Reply to Weed,


If you know the
best way for language *without GC* guaranteeing the existence of an
object without overhead - I have to listen!



Never delete anything?

One of the arguments for GC is that it might well have /less/ overhead than 
any other practical way of managing dynamic memory. Yes you can be very careful 
in keeping track of pointers (not practical) or use smart pointers and such 
(might end up costing more than GC) but neither is particularly nice.





Re: class Exception

2009-03-18 Thread Denis Koroskin

On Wed, 18 Mar 2009 17:30:07 +0300, Steve Teale  
wrote:


Denis Koroskin Wrote:

On Wed, 18 Mar 2009 17:01:25 +0300, Steve Teale  
 wrote:


> Another dumb question - where is class Exception defined/described?

dmd1\src\phobos\object.d
dmd2\src\druntime\import\object.di


Yes, eventually found it. Also dmd2,  
dmd\src\druntime\src\compiler\dmd\object_.d. I was looking because I  
remembered seeing once a distinction being made between class Exception,  
and class Error


These seem to be defined identically - is one of them defunct?


I believe the former is automatically generated from the latter one.


Re: class Exception

2009-03-18 Thread Steve Teale
Denis Koroskin Wrote:

> On Wed, 18 Mar 2009 17:01:25 +0300, Steve Teale 
>  wrote:
> 
> > Another dumb question - where is class Exception defined/described?
> 
> dmd1\src\phobos\object.d
> dmd2\src\druntime\import\object.di

Yes, eventually found it. Also dmd2, 
dmd\src\druntime\src\compiler\dmd\object_.d. I was looking because I remembered 
seeing once a distinction being made between class Exception, and class Error

These seem to be defined identically - is one of them defunct?


Re: class Exception

2009-03-18 Thread Denis Koroskin

On Wed, 18 Mar 2009 17:01:25 +0300, Steve Teale  
wrote:


Another dumb question - where is class Exception defined/described?


dmd1\src\phobos\object.d
dmd2\src\druntime\import\object.di


class Exception

2009-03-18 Thread Steve Teale
Another dumb question - where is class Exception defined/described?


Re: new D2.0 + C++ language

2009-03-18 Thread Weed
bearophile wrote:
> Weed:
>> I want to offer the dialect of the language D2.0, suitable for use where
>> are now used C/C++. Main goal of this is making language like D, but
>> corresponding "zero-overhead principle" like C++:
>> ...
>> The code on this language almost as dangerous as a code on C++ - it is a
>> necessary cost for increasing performance.
> 
> No, thanks...
> 
> And regarding performance, eventually it will come a lot from a good
> usage of multiprocessing,

The proposal will be able to support multiprocessing - for that it provides
reference counting in the debug version of binaries. If you know a
better way for a language *without GC* to guarantee the existence of an
object without overhead - I am ready to listen!

> that in real-world programs may need pure
> functions and immutable data.

I do not see any problems with this

> That D2 has already, while C++ is less lucky.



Re: new D2.0 + C++ language

2009-03-18 Thread Weed
Kagamin wrote:
> Kagamin Wrote:
> 
>> Weed Wrote:
>>
>>> - Its does not contains garbage collection and
>>> - allows active using of a stack for the objects (as in C++)
>>> - Its uses syntax and a tree of objects taken from the D
>> you just need to add syntactical support for it.
> 

Yes

> ...well, you already has it with structure constructors...

Remember that we have already discussed this here several times, and
came to the conclusion (?) that emulating "value semantics" with
structs is unreasonably difficult.



Re: new D2.0 + C++ language

2009-03-18 Thread Weed
Weed wrote:
> Hi!
> 

colorized example:
http://paste.dprogramming.com/dpd6j5co


Re: new D2.0 + C++ language

2009-03-18 Thread Kagamin
Kagamin Wrote:

> Weed Wrote:
> 
> > - Its does not contains garbage collection and
> > - allows active using of a stack for the objects (as in C++)
> > - Its uses syntax and a tree of objects taken from the D
> 
> you just need to add syntactical support for it.

...well, you already has it with structure constructors...


Re: new D2.0 + C++ language

2009-03-18 Thread Kagamin
Weed Wrote:

> - Its does not contains garbage collection and
> - allows active using of a stack for the objects (as in C++)
> - Its uses syntax and a tree of objects taken from the D

Garbage collection can be turned off already, you can get rid of it just by 
minor modification of druntime, stack allocation is also a minor modification 
to the compiler, you just need to add syntactical support for it.
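
As a sketch of what is possible without changing the compiler: druntime's core.memory.GC.disable suppresses automatic collections, and today's Phobos offers std.typecons.scoped for placing a class instance on the stack. Counter and countOnStack are illustrative names.

```d
import core.memory : GC;
import std.typecons : scoped;

class Counter
{
    int n;
    void incr() { ++n; }
}

int countOnStack()
{
    GC.disable();               // no automatic collections from here...
    scope (exit) GC.enable();   // ...until this function returns

    auto c = scoped!Counter();  // the instance lives in this stack frame
    c.incr();
    c.incr();
    return c.n;                 // the object is destroyed at scope exit
}
```

Note that GC.disable only stops collections from being triggered automatically; code that allocates with new still uses the GC heap.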


Re: new D2.0 + C++ language

2009-03-18 Thread bearophile
Weed:
> I want to offer the dialect of the language D2.0, suitable for use where
> are now used C/C++. Main goal of this is making language like D, but
> corresponding "zero-overhead principle" like C++:
>...
> The code on this language almost as dangerous as a code on C++ - it is a
> necessary cost for increasing performance.

No, thanks...

And regarding performance, eventually it will come a lot from a good usage of 
multiprocessing, that in real-world programs may need pure functions and 
immutable data. That D2 has already, while C++ is less lucky.

Bye,
bearophile


new D2.0 + C++ language

2009-03-18 Thread Weed
Hi!

I want to propose a dialect of D2.0 suitable for use where C/C++ are
used now. The main goal is a language like D that follows the
"zero-overhead principle" like C++:

- It does not contain garbage collection
- It allows active use of the stack for objects (as in C++)
- It uses the syntax and the tree of objects taken from D

Code in this language is almost as dangerous as code in C++ - it is a
necessary cost for increasing performance.

A compiler for this language does not exist! And it is unlikely that I
will be able to write one. In any case, I want to discuss it before
thinking about a compiler.

I will just give you a usage example, without a description of the pure
syntax, because I do not propose anything new for those who remember C++
and D.

Ask questions!


/*
Demonstration of a new dialect of the D language.
It describes only what differs from D2.0.

This language is compatible with the C and (probably) D ABI, but not with C++.
*/


/*
Structures and classes are completely similar, except that structs have
controlled alignment and lack polymorphism.

The structures are POD and fully compatible with the structures of the
C language.
*/
struct S {
    int var0;
    int var1;
    int var2;

    // Struct constructors are entirely the same as in classes:
    this() {
        var1 = 5;
    }
}

interface I {
    void incr();
}

// The structures are POD.
// They support inheritance without polymorphism, and they support
// interfaces too.
struct SD : S, I {
    int var3;
    int var4;

    void incr() { ++var3; ++var4; }

    /*
    Struct constructors are similar to class constructors.
    Calling the base constructor super() is required.
    */
    this() {
        super();
        var4 = 8;
    }
}

class C : I {
    int var;
    void incr() { ++var; }

    /*
    Instead of overloading the operator "=" for classes and structures,
    there is a constructor, the same as the copy constructor in C++:
    it accepts only an object of the same type as its parameter.
    This differs from D and is needed so that the copy constructor
    can change the source object
    (for example, to copy objects linked into a linked list).

    Unlike the C++ constructor, it first makes a bitwise copy of
    the original object, then runs an additional postblit, the same way
    as in D2.0. This allows faster copying than in C++.

    And about references:
    When compiling with the "-debug" option, the compiler builds the
    binary with reference counting.
    This approach is criticized by Walter Bright here:
    http://www.digitalmars.com/d/2.0/faq.html#reference-counting

    But if the language does not have a GC, reference counting is a good
    way to make sure that the referenced object exists. The cost is an
    additional pointer dereference and a check of the counter (and this
    is only when compiling with the "-debug" option!).
    */
    this( ref C src ) {
        var = src.var;
    }
}

class CD : C {
    real var2;
    void dumb_method() {};
}


void func()
{
    /*
    Classes in the heap are addressed by pointer. "*" is needed to
    distinguish classes in the heap from classes on the stack. I.e.,
    classes and structures are created the same way as in C++.
    */
    CD  cd_stack;             // Creates a class on the stack
    CD* cd_heap = new CD;     // Creates a class in the heap; new returns
                              // a pointer (same as in C++)
    C*  c_heap  = new C;
    C   c_stack;

    // Copying of objects (same as in C++)
    cd_stack = *cd_heap;
    *cd_heap = cd_stack;

    /*
    Copying of pointers to objects

    The "c_heap" pointer now points to the object "cd_heap", and the
    object that "c_heap" previously pointed to is not removed (as there
    is no GC and no smart-pointer template is used).
    This is a memory leak!
    */
    c_heap = cd_heap;

    /*
    "Slicing" demo:

    A parent object is copied from a derived class that has additional
    fields and methods. The "real var2" field is not available in
    "c_stack" and will not be copied:
    */
    c_stack = *cd_heap;

    /*
    Attempt to place an object of type C into the derived object of type
    CD. The field "real var2" is not filled by the C object. That field
    now contains garbage:
    */
    cd_stack = c_stack;
    cd_stack.var2; // <- garbage data

}


Re: utf-8?

2009-03-18 Thread Daniel Keep


Steve Teale wrote:
> Gide Nwawudu Wrote:
> 
>> On Tue, 17 Mar 2009 10:48:56 -0400, Steve Teale
>>  wrote:
>>
>>> import std.stdio;
>>>
>>> void main()
>>> {
>>>   string s = "Die Walküre";
>>>   writefln(s);
>>> }
>>>
>>> Gives error - invalid utf-8 sequence. I pasted the text from a Wiki page 
>>> that claims to be utf-8. What's happening?
>> Works for me, you should save the file as UTF-8 and set your codepage
>> to 65001.
>> C:\> dmd test.d
>>
>> C:\> test
>> Die Walk+?re
>>
>> C:\>chcp 65001
>> Active code page: 65001
>>
>> C:\>test
>> Die Walküre
>>
>> Gide
> 
> Yup, that does it. I'd missed the encoding option in notepad. What were you 
> running the program in - in a cmd window I see graphics characters.

You have to configure CMD to use Lucida Console as the font.  Also note
that CMD won't do fallbacks like virtually every other Windows app: if a
character isn't in Lucida Console, you won't see it.

  -- Daniel


Re: eliminate writeln et comp?

2009-03-18 Thread Don

Andrei Alexandrescu wrote:

Hey all y'all,


Here's another nice bicycle shed discussion. During the recent 
discussion about globals being harmful, Walter told me something that 
made me think. I said, hey, there are things that are global - look at 
stdout. He said, well, that's a bad thing. He then argued that it would 
be better and cleaner to write:


stdout.writeln("Hello, world");

instead of the current:

writeln("Hello, world");

On one hand, I agree with Walter. On the other, I want to avoid the 
phenomenon of the all-too-long "Hello, world" example.


What do you think?


Andrei


I have always thought of writefln as the flagship function of Phobos. 
It's now looking as though the connection between Phobos2 and Phobos1 is 
not really any stronger than between Tango1 and Phobos1.

(This is an observation, not intended to be critical in any way).


Re: utf-8?

2009-03-18 Thread Steve Teale
Gide Nwawudu Wrote:

> On Tue, 17 Mar 2009 10:48:56 -0400, Steve Teale
>  wrote:
> 
> >import std.stdio;
> >
> >void main()
> >{
> >   string s = "Die Walküre";
> >   writefln(s);
> >}
> >
> >Gives error - invalid utf-8 sequence. I pasted the text from a Wiki page 
> >that claims to be utf-8. What's happening?
> 
> Works for me, you should save the file as UTF-8 and set your codepage
> to 65001.
> C:\> dmd test.d
> 
> C:\> test
> Die Walk+?re
> 
> C:\>chcp 65001
> Active code page: 65001
> 
> C:\>test
> Die Walküre
> 
> Gide

Yup, that does it. I'd missed the encoding option in notepad. What were you 
running the program in - in a cmd window I see graphics characters.
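
For the record, the original failure can be reproduced without an editor in the loop. This is a sketch that assumes the file was first saved in a Latin-1 style code page, where "ü" is the single byte 0xFC; isValidUtf8 is an illustrative helper built on std.utf.validate.

```d
import std.utf : UTFException, validate;

// Returns true if the bytes form a well-formed UTF-8 sequence.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}
```

The bytes of "Walküre" in Latin-1 (57 61 6C 6B FC 72 65) fail this check, while the UTF-8 form, with C3 BC in place of FC, passes.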


Re: Octal literals: who uses this?

2009-03-18 Thread Don

Walter Bright wrote:

Don wrote:

and what the heck does "\00\0\000\" mean?


It doesn't matter, because if you're translating C code to D, the code 
is probably correct even if you don't know what it means.


Note that in C, you can't reasonably have \0 embedded in a string. But 
in both D and C# you can. So the "\" case isn't really a problem for 
C. It's far more likely in D that someone would write:

"1st\02nd\03rd\04th\0";
and expect it to work.
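
Concretely, assuming a compiler that accepts octal escapes in string literals (which is what this thread is about), a short sketch of the trap and one way around it; the constant names are illustrative:

```d
// "\02" is read as one octal escape (character 2), not NUL followed by '2'.
enum surprising = "1st\02nd";

// Writing the terminator as a two-digit hex escape avoids the ambiguity.
enum intended = "1st\x002nd";

static assert(surprising.length == 6);   // '1' 's' 't' 0x02 'n' 'd'
static assert(intended.length == 7);     // '1' 's' 't' 0x00 '2' 'n' 'd'
static assert(surprising[3] == '\x02');
static assert(intended[3] == '\0' && intended[4] == '2');
```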

I doubt there is much extant C code which uses octal. Automated 
translations of octal literals can be done accurately, and you're even 
supplying the 'htod' converter!


htod is not intended for creating implementation source code. It's just 
for headers. I expect most C translations will be done by hand.


The point is that a reasonable fraction of the few remaining instances 
of octal literals, will be machine translated, and will therefore be 
free from these errors.




Note that C# doesn't have octal literals, but does include \0. So 
there's a precedent for dropping them. This also means that right now, 
converting code from C# to D can also introduce obscure bugs. I'd 
argue that that's a scenario that is at least as likely as bugs from C.


It is a good point, but I don't see people translating C# to D. But I do 
see translating C to D (I do it myself!).



I think the argument for octal is very, very weak.


The issue is really the cost of it being in vs the benefit of pulling it 
out. I see very little cost of leaving it in, so it doesn't need much 
benefit to make it worthwhile.


Inertia is the strongest argument, I think.
Octal-related bugs may occur
(1) when translating from ancient C code, if octal is removed.
(2) when translating from C#, if octal is retained.
(3) when writing new D code, if octal is retained.

IMHO, (2) and (3) are more probable than (1). However, all 3 cases are 
quite unlikely. It's extremely low on the list of priorities.