Re: Tricky semantics of ranges & potentially numerous Phobos bugs

2012-10-18 Thread Don Clugston

On 17/10/12 23:41, H. S. Teoh wrote:

On Wed, Oct 17, 2012 at 12:55:56PM -0700, Jonathan M Davis wrote:
[...]

I'm increasingly convinced that input ranges which are not forward
ranges are useless for pretty much anything other than foreach. Far
too much requires that you be able to save the current state - and
most stuff _inherently_ requires it such that it's not simply a
question of implementing the function differently.


It's perfectly possible to implement joiner, chain, find, count, cmp,
equal, until, filter, map, reduce, without assuming that the value
returned by .front is persistent. Just to name a few. In fact, it's even
possible to implement cartesianProduct in which one of the ranges is an
input range.  I'd hardly call that useless.



And adding even further restrictions on input ranges just makes it
worse. It actually wouldn't hurt my feelings one whit if we got rid of
the idea of input ranges entirely.


The motivating example for input ranges, at least according to TDPL, is
find(). There's nothing about find() that precludes non-forward input
ranges. A lot would be missing from the usefulness of ranges if we were
forced to only use forward ranges.


[...]

Regardless, there's nothing in how input ranges are currently defined
which indicates that front would ever be invalidated for _any_ type of
range, and ByLine and ByChunk are pretty much the only ranges I've
ever seen which invalidate previous calls to front. So, I don't see
how you could think that they're anything but abnormal.


I can think of quite a few situations in which it's useful to not assume
that the return value of .front is persistent, which I've already
mentioned before: in-place array permutation, reused buffers for complex
computations, etc..



And if you really want to argue that whether front can be invalidated
or not is somehow part of the difference between an input range and a
forward range, then the documentation on that needs to make that
_very_ clear, and it's going to be that much worse to deal with input
ranges which aren't forward ranges.

[...]

I think I'm not so sure about Andrei's lumping input ranges with
persistent return values from .front together with forward ranges. Some
algorithms, like findAdjacent, do not need a forward range, but they do
need a persistent .front. I do not like the idea of artificially
limiting the scope of findAdjacent just because you can't assume input
ranges' .front returns a persistent value. Like somebody else mentioned,
whether .front is transient or not is orthogonal to whether the range is
an input range or a forward range. There can be ranges whose .front is
persistent, but they can't be forward ranges for practical reasons.


Is it actually orthogonal? Is it possible for a forward range to be 
transient?


Or is it an intermediate concept?
TransientInputRange -> NonTransientInputRange -> ForwardRange





Re: make install; where do .di files go?

2012-10-18 Thread Manu
On 18 October 2012 09:22, David Nadlinger  wrote:

> On Wednesday, 17 October 2012 at 23:33:44 UTC, Manu wrote:
>
>> That I support his comments and suggestion.
>>
>
> Oh, really just that then. I wasn't quite sure if you were still
> worried about
>
>
>  Although that seems sad; D shouldn't identify its self as
>> the second coming of D, since that basically implies that the first
>> coming was a failure.
>>
>
> David
>

Well,
1) if there IS legitimate concern of conflict with D1, then it must be /d2,
I'm just dubious that any conflict would actually exist, and new-comers
(ideally, the majority of the community in the future) would probably
presume /d and wonder what this /d2 is all about. Someone mentioned the
tango object.di case... I don't know anything about that, but I'll presume
others know better than me.
and 2) I don't really care where it is, I would just like an answer that is
agree'd by the various compilers, and which is included in each of their
search paths by default. How can one make a build script for their apps
when it's not consistent where libraries are to be found?
A package manager is all well and good, but it doesn't exist yet. We need
to nominate a standard path in the mean time.

I don't have to go adding -I/usr/include when compiling C code, that's the
point. If a lib was installed by some standard distribution, you should
just be able to use it in your app without trouble. This is particularly
important when dealing with D bindings for C libs, since it basically has
to work within the C conventions to find the lib, the headers should be
distributed similarly.


Re: Tricky semantics of ranges & potentially numerous Phobos bugs

2012-10-18 Thread monarch_dodra

On Thursday, 18 October 2012 at 07:09:04 UTC, Don Clugston wrote:

On 17/10/12 23:41, H. S. Teoh wrote:

Is it actually orthogonal? Is it possible for a forward range 
to be transient?


Or is it an intermediate concept?
TransientInputRange -> NonTransientInputRange -> ForwardRange


Save just means the range can save its position. If it is 
returning via a buffer, forward or not, it is going to be 
transient.


Take this forward range, that returns the strings "A", "B" and 
"C" ad infinitum:


//
enum _ABC = "ABC";

struct ABC
{
    char[1] buf = _ABC[0];
    size_t i;

    enum empty = false;
    @property char[] front(){return buf;}
    void popFront()
    {
        ++i;
        buf[0] = _ABC[i%3];
    }
    @property ABC save()
    {
        return this;
    }
}
//

This is a perfectly valid range, which you can save, but the 
returned string is transient:


//
import std.range;
import std.stdio;

void main()
{
    ABC abc;

    writeln("Printing 10 elements: ");
    abc.take(10).writeln('\n');

    writeln("Duplicating range");
    auto abc2 = abc.save;
    abc.popFront;
    foreach(v; zip(abc, abc2).take(5))
        write("[", v[0], ", ", v[1], "]");
    writeln('\n');

    writeln("Printing two consecutive elements:");
    auto first = abc.front;   // a view into abc's buffer, not a copy
    abc.popFront();
    auto second = abc.front;
    writeln("[", first, ", ", second, "]");
}
//

Produces:

//
Printing 10 elements:
["A", "B", "C", "A", "B", "C", "A", "B", "C", "A"]

Duplicating range
[B, A][C, B][A, C][B, A][C, B]

Printing two consecutive elements:
[C, C]
//

As you can see, you can perfectly iterate.
You can perfectly save the range. The saved range can be used to 
backtrack.
But if you attempt to store two consecutive fronts, things don't 
go well.


The same holds true for a Random Access range BTW.

Iteration and transient-ness of returned value are orthogonal 
concepts
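
A caller that has to keep an element of a transient range across a popFront can copy it first. The following is only a minimal sketch under that assumption: it repeats the ABC range from above, and the helper name firstTwo is made up for illustration.

```d
import std.stdio;

enum _ABC = "ABC";

// Same transient forward range as in the post: front is a view into buf.
struct ABC
{
    char[1] buf = _ABC[0];
    size_t i;

    enum empty = false;
    @property char[] front() { return buf[]; }
    void popFront()
    {
        ++i;
        buf[0] = _ABC[i % 3];
    }
    @property ABC save() { return this; }
}

// Hypothetical helper: returns two consecutive elements as independent copies.
char[][2] firstTwo(ref ABC r)
{
    char[] first = r.front.dup;   // copy out of the reused buffer
    r.popFront();
    char[] second = r.front.dup;
    return [first, second];
}

void main()
{
    ABC abc;
    auto pair = firstTwo(abc);
    assert(pair[0] == "A");
    assert(pair[1] == "B");
    writeln(pair[0], ", ", pair[1]);
}
```

The cost is one allocation per stored element, which is exactly what a buffer-reusing range is trying to avoid, so the copy is best left to the caller that actually needs persistence.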


Re: Tricky semantics of ranges & potentially numerous Phobos bugs

2012-10-18 Thread monarch_dodra
On Wednesday, 17 October 2012 at 19:56:08 UTC, Jonathan M Davis 
wrote:

[SNIP]
I'm increasingly convinced that input ranges which are not 
forward ranges are

useless for pretty much anything other than foreach.
[SNIP]
- Jonathan M Davis


That's already a pretty big usage :) The thing is you just can't 
"do" anything with them, but that *is* the design.


Just read them once to place them into another container. The 
fact they interface with, say "array", or "appender", or copy, 
makes the interface convenient.


The fact that byLine will choke on a call to "array", IMO, has 
nothing to do with whether it is a forward range.
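
To make the "array" problem concrete, here is a rough sketch with a stand-in for byLine. ReusedBufferLines is a hypothetical input range invented for this example (it is not a Phobos type); it copies every element into one shared heap buffer, which is assumed to be similar in spirit to what byLine does.

```d
import std.algorithm : map;
import std.array : array;

// Hypothetical input range that reuses one buffer for every element.
struct ReusedBufferLines
{
    private const(string)[] lines;
    private char[] buf;
    private size_t len;

    this(const(string)[] lines)
    {
        this.lines = lines;
        size_t longest;
        foreach (l; lines)
            if (l.length > longest) longest = l.length;
        buf = new char[longest];
        fill();
    }

    @property bool empty() const { return lines.length == 0; }
    @property char[] front() { return buf[0 .. len]; }
    void popFront() { lines = lines[1 .. $]; fill(); }

    // Overwrites the shared buffer with the current line, if there is one.
    private void fill()
    {
        if (lines.length == 0) return;
        len = lines[0].length;
        buf[0 .. len] = lines[0][];
    }
}

void main()
{
    auto src = ["aa", "bb", "cc"];

    // Every stored slice aliases the same buffer, which ends up holding "cc".
    auto aliased = ReusedBufferLines(src).array;
    assert(aliased.length == 3);
    foreach (s; aliased)
        assert(s == "cc");

    // Copying each element before it is invalidated keeps the data intact.
    auto copied = ReusedBufferLines(src).map!(l => l.idup).array;
    assert(copied.length == 3);
    assert(copied[0] == "aa" && copied[1] == "bb" && copied[2] == "cc");
}
```

Nothing here depends on save: the range is a plain input range either way, and only the transience of front decides whether "array" gives a usable result.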


Re: Bits rotations

2012-10-18 Thread Iain Buclaw
On 18 October 2012 03:36, bearophile  wrote:
> In cryptographic code, and generally in bit-twiddling code, rotation of the
> bits in a word is a common operation. It's so common that Intel has made it
> an asm instruction.
>
> A demo program:
>
>
> uint foo(in uint x) pure nothrow {
>     return (x << 11) | (x >> (32 - 11));
> }
>
> uint bar(uint x, in ubyte n) pure nothrow {
>     asm {
>         mov EAX, x;
>         mov CL, n;
>         rol EAX, CL;
>         mov x, EAX;
>     }
>     return x;
> }
>
> void main() {
>     import std.stdio;
>     uint x = 4290772992U;
>     writefln("%032b", x);
>     uint y = foo(x);
>     writefln("%032b", y);
>     uint z = bar(x, 11);
>     writefln("%032b", z);
> }
>
>
> Its output, dmd -O -release -inline:
>
> 11111111110000000000000000000000
> 00000000000000000000011111111110
> 00000000000000000000011111111110
>
>
> Even with full optimizations DMD seems not able to detect the rotation in
> foo(), and writing assembly as in bar() is not an option, because the code
> is even longer and bar() can't be inlined. This is the ASM generated (32
> bit):
>
>
> _D4test3fooFNaNbxkZk:
>     push    EAX
>     mov     ECX,[ESP]
>     shl     EAX,0Bh
>     shr     ECX,015h
>     or      EAX,ECX
>     pop     ECX
>     ret
>
> _D4test3barFNaNbkxhZk:
>     push    EBP
>     mov     EBP,ESP
>     push    EAX
>     mov     EAX,8[EBP]
>     mov     CL,-4[EBP]
>     rol     EAX,CL
>     mov     8[EBP],EAX
>     mov     EAX,8[EBP]
>     mov     ESP,EBP
>     pop     EBP
>     ret     4
>
>
> GDC 4.6.3 does better, recognizing the rol (foo() can be inlined, usually
> becoming 1 instruction in the middle of other code) (I like the demangling
> here!):
>
> pure nothrow uint example.foo(const(uint)):
>     movl    %edi, %eax
>     roll    $11, %eax
>     ret
>
> pure nothrow uint example.bar(uint, const(ubyte)):
>     movl    %edi, -4(%rsp)
>     movb    %sil, -5(%rsp)
>     movl    -4(%rsp), %eax
>     movb    -5(%rsp), %cl
>     roll    %cl, %eax
>     movl    %eax, -4(%rsp)
>     movl    -4(%rsp), %eax
>     ret
>
>
> So I'd like to write:
>
> uint spam(in uint x) pure nothrow @safe {
>     import core.bitop: rol;
>     return rol(x, 11);
> }
>
>
> This is better because:
> - It's standard. If a CPU supports rol (or ror) the compiler uses it. If it
> doesn't support it, the compiler uses shifts and an or.
> - It's shorter and more readable. For me "rol(x, 11)" is rather more easy to
> read and debug than code like "(x << 11) | (x >> (32 - 11))".
> - spam() is inlinable, just like foo() and unlike bar().
> - It doesn't rely on compiler optimizations, that sometimes are not present.
> If the CPU supports the rol, the compiler doesn't need pattern matching, it
> just spits out a rol. DMD currently has such optimization, but apparently
> here it's not working, I don't know why. For such basic operation, that is
> built in many CPUs there is no need for compiler optimizations.
>
> Bye,
> bearophile


In the gdc-4.6 package you have there, it's only naked asm that can't
be inlined.   However it is worth noting that DIASM is no longer in
mainline gdc.

Thanks,
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
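
For readers trying the proposal out: as far as I know, later druntime releases do ship rol and ror in core.bitop, although they did not exist when this thread was written. The sketch below assumes such a release; rotl32 is a hypothetical fallback name used here only to compare against the manual shift-and-or form from the demo program.

```d
import core.bitop : rol, ror;

// Portable fallback using two shifts and an or; valid for 0 < n < 32.
uint rotl32(uint x, uint n) pure nothrow @nogc @safe
{
    return (x << n) | (x >> (32 - n));
}

void main()
{
    uint x = 4290772992U; // 0xFFC0_0000, the value from the demo program

    assert(rotl32(x, 11) == 0x0000_07FE);
    assert(rol(x, 11) == rotl32(x, 11)); // library rotate agrees with the manual form
    assert(ror(rol(x, 11), 11) == x);    // rotating back restores the input
}
```

Whether a given compiler turns either form into a single rotate instruction is still up to its optimizer or intrinsic support, so the generated code is worth checking when it matters.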


Re: How mutable is immutable?

2012-10-18 Thread Don Clugston

On 17/10/12 18:02, Timon Gehr wrote:

On 10/17/2012 01:49 PM, Don Clugston wrote:

On 01/01/12 13:50, Timon Gehr wrote:

On 01/01/2012 10:40 AM, Denis Shelomovskij wrote:

So, I'm a function `f`, I have an `immutable(type)[]` argument and I
want to store it for my friend `g` in an TLS variable `v`:
---
string v;
debug string sure;

void f(string s) { v = s; debug sure = s.idup; }
void g() { assert(v == sure); }
---
I also store a copy of `s` into `sure` for my friend to ensure
immutable data hasn't been mutated.
Can my friend's assertion ever fail without breaking a type-system?
Sure. Just consider this:
---
void main() {
    auto s = "abba".idup;
    f(s);
    delete s;
    g();
}
---
Is it by-design? Looks like deleting immutable (and const because of
implicit conversion) data should be prohibited.
OK. Let `delete` be fixed. Can we still fail?
---
void h() {
    immutable(char)[4] s = "abba";
    f(s);
}
void main() {
    h();
    g();
}
---
Damn! So, what can we do with it? Not sure, but I have a proposal.

Fix it in language:
* disallow `delete` of const/immutable data
* disallow immutable data on the stack

This makes data really immutable if I don't miss something. Anyway, I
want `immutable` qualified data to be immutable without breaking a
type-system (if one do it, its his own responsibility), so some changes
should be made (IMHO).


You are using unsafe language features to break the type system. That is
not the fault of the type system.

'@safe:' at the top of the program should stop both examples from
working, it is a bug that it does not.


That's the point -- *which* checks are missing from @safe?


Escaping stack data and arbitrarily freeing memory are not operations
found in memory safe languages.


HOW do you propose to check for escaping stack data?


But I'm not sure that you're right, this looks broken to me, even
without @safe.

What does it mean to create immutable data on the stack? The stack is
intrinsically mutable!


So is the heap.


No it is not. Data on the stack *cannot* survive past the end of the 
function call. Data on the heap can last forever.



What does it mean to garbage collect immutable data?


From the point of view of the application, it doesn't happen. There are 
no observable semantics. It's merely an implementation detail.



What does it mean to allocate an 'int' on the stack?


What does it mean to delete immutable data?


Deallocate the storage for it and make it available for reuse.
Accessing it afterwards leads to arbitrary behaviour. This is the same
with mutable data. As the program may behave arbitrarily in this case,
it is valid behaviour to act as if immutable data changed.


No, you've broken the type system if you've deleted immutable data.
If I have a reference to an immutable variable, I have a guarantee that 
it will never change. delete will break that guarantee.


With a mutable variable, I have no such guarantee. (It's not safe to 
allocate something different in the deleted location, but it's OK to run 
the finalizer and then wipe all the memory).



I think it's reasonable for both of them to require a cast, even in
@system code.



The implementation of the 'scope' storage class should be fixed. We
could then require an unsafe cast(scope) to disable prevention of stack
address escaping.


No we can't. f cannot know that the string it has been given is on the 
stack. So main() must prevent it from being given to f() in the first 
place. How can it do that?


void foo(bool b, string y)
{
  immutable (char)[4] x = "abba";
  string s = b ? x : y;
  f(s);
}

Make it safe.



Rust's borrowed pointers may give some hints on how
to extend 'scope' to fields of structs.


I think it is more fundamental than that.


As to delete, delete is as unsafe when the involved data is immutable
as when it is mutable. Why require an additional cast in one case?


This is not about safety.
Modifying immutable data breaks the type system. Deleting mutable data 
does not. AFAIK it is safe to implement delete as a call to the 
finalizer, followed by setting the memory to T.init. Only the GC can 
determine if it is safe to reuse the memory.


Deleting immutable data just doesn't make sense.
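
The stack-escape case discussed above has a caller-side workaround that needs no language change: pass a heap copy instead of a view of the stack array. This is only a sketch of that idea, not a proposal from the thread; hSafe is a hypothetical name, and the copy does not answer the question of how the compiler could reject the unsafe version.

```d
string v; // thread-local, as in the original example

void f(string s) { v = s; }

// Problematic variant from the thread: the slice handed to f points into h's frame.
// void h() { immutable(char)[4] s = "abba"; f(s); }

// Caller-side fix: idup allocates an immutable copy on the GC heap.
void hSafe()
{
    immutable(char)[4] s = "abba";
    f(s[].idup);
}

void main()
{
    hSafe();
    assert(v == "abba"); // v does not refer to hSafe's dead stack frame
}
```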


Re: make install; where do .di files go?

2012-10-18 Thread Danni Coy
I was thinking of it as a forward-thinking way of doing things (taking my
cue from Python, for which multiple versions do end up being installed on
most people's systems).
I would make /usr/include/d1 for D version 1 files, /usr/include/d2 for
version 2, /usr/include/d3 for version 3, etc.
/usr/include/d should be a symlink that points to what is considered the
current version of the language.

I am not saying that it is the best way to do things but that is my
thinking.


Re: Regarding hex strings

2012-10-18 Thread foobar

On Thursday, 18 October 2012 at 02:47:42 UTC, H. S. Teoh wrote:

On Thu, Oct 18, 2012 at 02:45:10AM +0200, bearophile wrote:
[...]
hex strings are useful, but I think they were invented in D1 when
strings were convertible to char[]. But today they are an array of
immutable UTF-8, so I think this default type is not so useful:

void main() {
string data1 = x"A1 B2 C3 D4"; // OK
immutable(ubyte)[] data2 = x"A1 B2 C3 D4"; // error
}


test.d(3): Error: cannot implicitly convert expression
("\xa1\xb2\xc3\xd4") of type string to ubyte[]

[...]

Yeah I think hex strings would be better as ubyte[] by default.

More generally, though, I think *both* of the above lines should be
equally accepted.  If you write x"A1 B2 C3" in the context of
initializing a string, then the compiler should infer the type of the
literal as string, and if the same literal occurs in the context of,
say, passing a ubyte[], then its type should be inferred as ubyte[], NOT
string.


T


IMO, this is a redundant feature that complicates the language 
for no benefit and should be deprecated.
strings already have an escape sequence for specifying 
code-points "\u" and for ubyte arrays you can simply use:

immutable(ubyte)[] data2 = [0xA1 0xB2 0xC3 0xD4];

So basically this feature gains us nothing.



Re: make install; where do .di files go?

2012-10-18 Thread David Nadlinger

On Thursday, 18 October 2012 at 07:29:45 UTC, Manu wrote:
and 2) I don't really care where it is, I would just like an 
answer that is
agree'd by the various compilers, and which is included in each 
of their
search paths by default. How can one make a build script for 
their apps

when it's not consistent where libraries are to be found?


Just an aside: During the whole discussion, please don't forget 
that you also need to compile your library separately for each 
compiler, as long as they are not ABI compatible (which is still 
a long way out). Thus, while I don't think a standard import path 
wouldn't be nice to have, the issue doesn't stand out as a single 
big problem as it might seem at first.


David


Re: Bits rotations

2012-10-18 Thread Iain Buclaw
On 18 October 2012 09:27, bearophile  wrote:
> Iain Buclaw:
>
>
>> In the gdc-4.6 package you have there, it's only naked asm that can't be
>> inlined.
>
>
> Good.
>
>
>
>> However it is worth noting that DIASM is no longer in mainline gdc.
>
>
> What's DIASM? Is it the D syntax for asm code? If this is right, then gdc
> developers have done a mistake, reducing D code interoperability, creating
> an incompatibility where there wasn't (and reducing my desire to use gdc or
> to switch to it, because I have hundreds of lines of inlined asm in my D
> code), this means doing the opposite of what generally compiler writers are
> supposed to do (maybe this topic was discussed already, in past).
>
> Bye,
> bearophile


This topic has been discussed in the past.  And the current status is
that GCC mainline has poisoned the frontend to use certain headers
that the IASM implementation in GDC depended on.

Example:

int zz(int p1)
{
  asm {
    naked;
    mov EAX, p1[EBP];
  }
}


To calculate p1[EBP], one would have to know where p1 will land on the
frame pointer to replace it with the relevant offset value.  This
would mean from the front-end we would have to invoke the back-end to
generate and tell us the stack frame layout of zz, which is not
possible because:

a) Invoking this before the optimisation passes may produce a
different result to what that actual result is after the optimisation
passes.
b) All functions are sitting in poisoned (for the front-end) headers.

There is an opportunity to defer parsing IASM until the GIMPLE
(middle-end) stage, however I am still unable to retrieve the required
information to produce the correct codegen.


Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


Re: Regarding hex strings

2012-10-18 Thread monarch_dodra

On Thursday, 18 October 2012 at 08:58:57 UTC, foobar wrote:


IMO, this is a redundant feature that complicates the language 
for no benefit and should be deprecated.
strings already have an escape sequence for specifying 
code-points "\u" and for ubyte arrays you can simply use:

immutable(ubyte)[] data2 = [0xA1 0xB2 0xC3 0xD4];

So basically this feature gains us nothing.


Have you actually ever written code that requires using code 
points? This feature is a *huge* convenience for when you do. 
Just compare:


string nihongo1 = x"e697a5 e69cac e8aa9e";
string nihongo2 = "\ue697a5\ue69cac\ue8aa9e";
ubyte[] nihongo3 = [0xe6, 0x97, 0xa5, 0xe6, 0x9c, 0xac, 0xe8, 
0xaa, 0x9e];


BTW, your data2 doesn't compile.


Re: Regarding hex strings

2012-10-18 Thread monarch_dodra

On Thursday, 18 October 2012 at 00:45:12 UTC, bearophile wrote:

(Repost)

hex strings are useful, but I think they were invented in D1 
when strings were convertible to char[]. But today they are an 
array of immutable UTF-8, so I think this default type is not 
so useful:


void main() {
string data1 = x"A1 B2 C3 D4"; // OK
immutable(ubyte)[] data2 = x"A1 B2 C3 D4"; // error
}


test.d(3): Error: cannot implicitly convert expression 
("\xa1\xb2\xc3\xd4") of type string to ubyte[]


[SNIP]

Bye,
bearophile


The conversion can't be done *implicitly*, but you can still get 
your code to compile:


//
void main() {
immutable(ubyte)[] data2 =
cast(immutable(ubyte)[]) x"A1 B2 C3 D4"; // OK!
}
//

It's a bit ugly, and I agree it should work natively, but it is a 
workaround.


Re: Const ref and rvalues again...

2012-10-18 Thread foobar

On Thursday, 18 October 2012 at 06:11:26 UTC, monarch_dodra wrote:
On Thursday, 18 October 2012 at 04:30:17 UTC, Jonathan M Davis 
wrote:

On Thursday, October 18, 2012 06:24:08 jerro wrote:

What would be the problem with const ref taking rvalues?


Read the thread that I already linked to:

http://forum.dlang.org/thread/4f84d6dd.5090...@digitalmars.com

- Jonathan M Davis


I read the thread, and not a single one of the "problematic 
cases" are actually valid C++.


Yes: the faulty MSVC has taught people to do retarded things, 
or be afraid of things that were illegal to begin with (in 
particular, pass an rvalue to a ref, WHICH IS ILLEGAL IN C++), 
such as "increment(5)".


There is actually nothing wrong with creating a temporary when 
something is bound to a const ref, provided the compiler 
follows the rules:


*Only LValues with an EXACT type match may be passed to a 
reference.
*In regards to *const* references, RValues may be copied in a 
temporary, and that temporary bound to the ref.


I'm not saying we particularly *need* this in D (C++ has a "by 
ref" paradigm that makes it more important, but D *rarely* 
ever passes by const ref).


But if the compiler respects the above two rules (which it 
should), then RValue to const ref is both perfectly doable and 
safe (as safe as refs get anyways).


By allowing the C++ semantics, the function loses semantic 
information: whether the actual parameter was an lvalue or an rvalue. 
This semantic info can be used both for compiler optimizations and 
move semantics. This is the reason C++11 added && references.


General question (might not be relevant to current design of D):
How about leaving the decision to the compiler and let the 
programmer only specify usage intent?

E.g.: (I'm speaking semantics here, not syntax)
void foo(const Type t); // 1. I only read the value
void foo (mutate Type t); // 2. I want to also mutate the actual 
parameter

void foo (move Type t); // 3. I want to move the actual parameter

In case 1 above, the compiler is free to pass lvalues by const& 
and rvalues by value or perhaps optimize above certain size to 
const& too.
In case 2, the compiler passes a ref to lvalue, rvalues are not 
accepted at CT.
If I want move semantics, I can use option 3 which accepts 
rvalues by ref. btw, what's the correct semantics for lvalues 
here?


What do you think?
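
For comparison, D templates already offer something close to case 1 through auto ref parameters: lvalues are bound by reference, rvalues are passed by value, and the function can tell which happened. A minimal sketch follows; boundByRef is a hypothetical name chosen for illustration.

```d
// Hypothetical helper: reports how the argument was bound.
bool boundByRef(T)(auto ref const T t)
{
    return __traits(isRef, t);
}

void main()
{
    int x = 5;
    assert(boundByRef(x));   // lvalue: passed by reference
    assert(!boundByRef(5));  // rvalue: passed by value
}
```

This only works for templated functions, since each binding mode is a separate instantiation, so it does not cover the non-template case the question is really about.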


Re: make install; where do .di files go?

2012-10-18 Thread Jordi Sayol
Al 18/10/12 11:21, En/na David Nadlinger ha escrit:
> On Thursday, 18 October 2012 at 07:29:45 UTC, Manu wrote:
>> and 2) I don't really care where it is, I would just like an answer that is
>> agree'd by the various compilers, and which is included in each of their
>> search paths by default. How can one make a build script for their apps
>> when it's not consistent where libraries are to be found?
> 
> Just an aside: During the whole discussion, please don't forget that you also 
> need to compile your library separately for each compiler, as long as they 
> are not ABI compatible (which is still a long way out). Thus, while I don't 
> think a standard import path wouldn't be nice to have, the issue doesn't 
> stand out as a single big problem as it might seem at first.
> 
> David
> 

+1

-- 
Jordi Sayol


Re: Regarding hex strings

2012-10-18 Thread bearophile

The docs say:
http://dlang.org/lex.html

Hex strings allow string literals to be created using hex data. 
The hex data need not form valid UTF characters.<


But this code:


void main() {
immutable ubyte[4] data = x"F9 04 C1 E2";
}



Gives me:

temp.d(2): Error: Outside Unicode code space

Are the docs correct?

--

foobar:

Seems to me this is in the same ballpark as the built-in 
complex numbers. Sure it's nice to be able to write "4+5i" 
instead of "complex(4,5)" but how frequently do you actually 
ever need the _literals_ even in complex computational heavy 
code?


Compared to "oct!5151151511", one problem with code like this is 
that binary blobs are sometimes large, so supporting a x"" syntax 
is better:


immutable ubyte[4] data = hex!"F9 04 C1 E2";

Bye,
bearophile


Re: Regarding hex strings

2012-10-18 Thread foobar

On Thursday, 18 October 2012 at 09:42:43 UTC, monarch_dodra wrote:

On Thursday, 18 October 2012 at 08:58:57 UTC, foobar wrote:


IMO, this is a redundant feature that complicates the language 
for no benefit and should be deprecated.
strings already have an escape sequence for specifying 
code-points "\u" and for ubyte arrays you can simply use:

immutable(ubyte)[] data2 = [0xA1 0xB2 0xC3 0xD4];

So basically this feature gains us nothing.


Have you actually ever written code that requires using code 
points? This feature is a *huge* convenience for when you do. 
Just compare:


string nihongo1 = x"e697a5 e69cac e8aa9e";
string nihongo2 = "\ue697a5\ue69cac\ue8aa9e";
ubyte[] nihongo3 = [0xe6, 0x97, 0xa5, 0xe6, 0x9c, 0xac, 0xe8, 
0xaa, 0x9e];


BTW, your data2 doesn't compile.


I didn't try to compile it :) I just rewrote bearophile's example 
with 0x prefixes.


How often do you actually need to write code-point _literals_ in 
your code?
I'm not arguing that it isn't convenient. My question would be 
rather Andrei's "does it pull its own weight?" meaning: is the 
added complexity in the language, and having more than one way of 
doing something, worth that convenience?


Seems to me this is in the same ballpark as the built-in complex 
numbers. Sure it's nice to be able to write "4+5i" instead of 
"complex(4,5)" but how frequently do you actually ever need the 
_literals_ even in complex computational heavy code?


Re: Regarding hex strings

2012-10-18 Thread foobar

On Thursday, 18 October 2012 at 10:05:06 UTC, bearophile wrote:

The docs say:
http://dlang.org/lex.html

Hex strings allow string literals to be created using hex data. 
The hex data need not form valid UTF characters.<


But this code:


void main() {
immutable ubyte[4] data = x"F9 04 C1 E2";
}



Gives me:

temp.d(2): Error: Outside Unicode code space

Are the docs correct?

--

foobar:

Seems to me this is in the same ballpark as the built-in 
complex numbers. Sure it's nice to be able to write "4+5i" 
instead of "complex(4,5)" but how frequently do you actually 
ever need the _literals_ even in complex computational heavy 
code?


Compared to "oct!5151151511", one problem with code like this 
is that binary blobs are sometimes large, so supporting a x"" 
syntax is better:


immutable ubyte[4] data = hex!"F9 04 C1 E2";

Bye,
bearophile


How often are large binary blobs literally spelled out in the source 
code (as opposed to just being read from a file)?
In any case, I'm not opposed to such a utility library, in fact I 
think it's a rather good idea and we already have a precedent 
with "oct!"
I just don't think this belongs as a built-in feature in the 
language.


Re: Regarding hex strings

2012-10-18 Thread foobar

On Thursday, 18 October 2012 at 10:11:14 UTC, foobar wrote:

On Thursday, 18 October 2012 at 10:05:06 UTC, bearophile wrote:

The docs say:
http://dlang.org/lex.html

Hex strings allow string literals to be created using hex 
data. The hex data need not form valid UTF characters.<




This is especially a good reason to remove this feature as it 
breaks the principle of least surprise and I consider it a major 
bug, not a feature.


I expect D's strings which are by definition Unicode to _only_ 
ever allow _valid_ Unicode. It makes no sense whatsoever to 
allow this nasty back-door. Other text encoding should be either 
stored and treated as binary data (ubyte[]) or better yet stored 
in their own types that will ensure those encodings' invariants.


Re: Regarding hex strings

2012-10-18 Thread monarch_dodra

On Thursday, 18 October 2012 at 10:17:06 UTC, foobar wrote:

On Thursday, 18 October 2012 at 10:11:14 UTC, foobar wrote:

On Thursday, 18 October 2012 at 10:05:06 UTC, bearophile wrote:

The docs say:
http://dlang.org/lex.html

Hex strings allow string literals to be created using hex 
data. The hex data need not form valid UTF characters.<




This is especially a good reason to remove this feature as it 
breaks the principle of least surprise and I consider it a 
major bug, not a feature.


I expect D's strings which are by definition Unicode to _only_ 
ever allow _valid_ Unicode. It makes no sense whatsoever to 
allow this nasty back-door. Other text encoding should be 
either stored and treated as binary data (ubyte[]) or better 
yet stored in their own types that will ensure those encodings' 
invariants.


Yeah, that makes sense too. I'll try to toy around on my end and 
see if I can write an "hex".


Re: make install; where do .di files go?

2012-10-18 Thread Manu
On 18 October 2012 12:21, David Nadlinger  wrote:

> On Thursday, 18 October 2012 at 07:29:45 UTC, Manu wrote:
>
>> and 2) I don't really care where it is, I would just like an answer that
>> is
>> agree'd by the various compilers, and which is included in each of their
>> search paths by default. How can one make a build script for their apps
>> when it's not consistent where libraries are to be found?
>>
>
> Just an aside: During the whole discussion, please don't forget that you
> also need to compile your library separately for each compiler, as long as
> they are not ABI compatible (which is still a long way out). Thus, while I
> don't think a standard import path wouldn't be nice to have, the issue
> doesn't stand out as a single big problem as it might seem at first.


Really? The D compilers aren't ABI compatible on Linux?
Good to know. Cheers!

Well, for my own purposes, I only intend to interact with bindings for C
libs, which, fortunately, are universally binary compatible :)


Re: Regarding hex strings

2012-10-18 Thread monarch_dodra

On Thursday, 18 October 2012 at 10:39:46 UTC, monarch_dodra wrote:


Yeah, that makes sense too. I'll try to toy around on my end 
and see if I can write an "hex".


That was actually relatively easy!

Here are some use cases:

//
void main()
{
    enum a = hex!"01 ff 7f";
    enum b = hex!0x01_ff_7f;
    ubyte[] c = hex!"0123456789abcdef";
    immutable(ubyte)[] bearophile1 = hex!"A1 B2 C3 D4";
    immutable(ubyte)[] bearophile2 = hex!0xA1_B2_C3_D4;

    a.writeln();
    b.writeln();
    c.writeln();
    bearophile1.writeln();
    bearophile2.writeln();
}
//

And corresponding output:

//
[1, 255, 127]
[1, 255, 127]
[1, 35, 69, 103, 137, 171, 205, 239]
[161, 178, 195, 212]
[161, 178, 195, 212]
//

hex! was a very good idea actually, imo. I'll post my current 
impl in the next post.


That said, I don't know if I'd deprecate x"", as it serves a 
different role, as you have already pointed out, in that it 
*will* validate the code points.


Re: Regarding hex strings

2012-10-18 Thread monarch_dodra

On Thursday, 18 October 2012 at 11:24:04 UTC, monarch_dodra wrote:
hex! was a very good idea actually, imo. I'll post my current 
impl in the next post.




//
import std.stdio;
import std.conv;
import std.ascii;

// hex!"01 ff 7f": decode a string of hex digit pairs at compile time.
template hex(string s)
{
    enum hex = decode(s);
}

// hex!0x01_ff_7f: decode an integer literal at compile time.
template hex(ulong ul)
{
    enum hex = decode(ul);
}

ubyte[] decode(string s)
{
    ubyte[] ret;
    size_t p;
    while(p < s.length)
    {
        // Skip separators in front of the next digit pair.
        while( s[p] == ' ' || s[p] == '_' )
        {
            ++p;
            if (p == s.length) assert(0, text("Premature end of string at index ", p, "."));
        }

        // First (high) hex digit.
        char c1 = s[p];
        if (!std.ascii.isHexDigit(c1)) assert(0, text("Unexpected character ", c1, " at index ", p, "."));
        c1 = cast(char)std.ascii.toUpper(c1);

        ++p;
        if (p == s.length) assert(0, text("Premature end of string after ", c1, "."));

        // Second (low) hex digit.
        char c2 = s[p];
        if (!std.ascii.isHexDigit(c2)) assert(0, text("Unexpected character ", c2, " at index ", p, "."));
        c2 = cast(char)std.ascii.toUpper(c2);
        ++p;

        // Combine the two digits into one byte.
        ubyte val;
        if('0' <= c2 && c2 <= '9') val += (c2 - '0');
        if('A' <= c2 && c2 <= 'F') val += (c2 - 'A' + 10);
        if('0' <= c1 && c1 <= '9') val += ((c1 - '0')*16);
        if('A' <= c1 && c1 <= 'F') val += ((c1 - 'A' + 10)*16);
        ret ~= val;
    }
    return ret;
}

ubyte[] decode(ulong ul)
{
    //NOTE: This is not efficient AT ALL (push front)
    //but it is ctfe, so we can live with it for now ^^
    //I'll optimize it if I try to push it
    ubyte[] ret;
    while(ul)
    {
        ubyte t = ul%256;
        ret = t ~ ret;
        ul /= 256;
    }
    return ret;
}
//

NOT a final version.
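
The push-front in decode(ulong) could be avoided by counting the bytes first and then filling a single allocation from the back. What follows is only a minimal sketch under that assumption: the name decodeBytes is hypothetical and not part of the posted code, and whether it makes any measurable difference under CTFE has not been measured here.

```d
// Hypothetical helper, not the posted implementation: splits an integer
// into its bytes (most significant first) with a single allocation.
ubyte[] decodeBytes(ulong ul)
{
    // First pass: count the bytes needed. Zero yields an empty array,
    // which matches what the posted decode(ulong) returns for 0.
    size_t n = 0;
    for (ulong t = ul; t != 0; t >>= 8)
        ++n;

    // Second pass: allocate once, fill starting from the least significant byte.
    auto ret = new ubyte[](n);
    foreach_reverse (i; 0 .. n)
    {
        ret[i] = cast(ubyte)(ul & 0xFF);
        ul >>= 8;
    }
    return ret;
}

void main()
{
    // Force compile-time evaluation, as the hex! template would.
    enum ubyte[] atCompileTime = decodeBytes(0x01_ff_7f);
    static assert(atCompileTime.length == 3);

    ubyte[] expected = [1, 255, 127];
    assert(decodeBytes(0x01_ff_7f) == expected);
    assert(decodeBytes(0).length == 0);
}
```

As with the posted version, leading zero bytes of the integer are dropped, so this form cannot express a literal such as "00 ff"; the string overload is still needed for that.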


Re: make install; where do .di files go?

2012-10-18 Thread Jacob Carlborg

On 2012-10-18 13:09, Manu wrote:


Really? The D compilers aren't ABI compatible on Linux?
Good to know. Cheers!


They all use different runtimes and at least GDC won't implement the D 
calling convention. GDC uses the calling convention of the backend. GDC 
also doesn't implement DMD inline assembly.


--
/Jacob Carlborg


Re: How mutable is immutable?

2012-10-18 Thread Artur Skawina
On 10/18/12 10:08, Don Clugston wrote:
> On 17/10/12 18:02, Timon Gehr wrote:
>> On 10/17/2012 01:49 PM, Don Clugston wrote:
>>> On 01/01/12 13:50, Timon Gehr wrote:
>>>> On 01/01/2012 10:40 AM, Denis Shelomovskij wrote:
>>>>> So, I'm a function `f`, I have an `immutable(type)[]` argument and I
>>>>> want to store it for my friend `g` in a TLS variable `v`:
>>>>> ---
>>>>> string v;
>>>>> debug string sure;
>>>>>
>>>>> void f(string s) { v = s; debug sure = s.idup; }
>>>>> void g() { assert(v == sure); }
>>>>> ---
>>>>> I also store a copy of `s` into `sure` for my friend to ensure
>>>>> immutable data hasn't been mutated.
>>>>> Can my friend's assertion ever fail without breaking a type-system?
>>>>> Sure. Just consider this:
>>>>> ---
>>>>> void main() {
>>>>>     auto s = "abba".idup;
>>>>>     f(s);
>>>>>     delete s;
>>>>>     g();
>>>>> }
>>>>> ---
>>>>> Is it by-design? Looks like deleting immutable (and const because of
>>>>> implicit conversion) data should be prohibited.
>>>>> OK. Let `delete` be fixed. Can we still fail?
>>>>> ---
>>>>> void h() {
>>>>>     immutable(char)[4] s = "abba";
>>>>>     f(s);
>>>>> }
>>>>> void main() {
>>>>>     h();
>>>>>     g();
>>>>> }
>>>>> ---
>>>>> Damn! So, what can we do with it? Not sure, but I have a proposal.
>>>>>
>>>>> Fix it in language:
>>>>> * disallow `delete` of const/immutable data
>>>>> * disallow immutable data on the stack
>>>>>
>>>>> This makes data really immutable if I don't miss something. Anyway, I
>>>>> want `immutable` qualified data to be immutable without breaking a
>>>>> type-system (if one does it, it's his own responsibility), so some changes
>>>>> should be made (IMHO).
>>>>
>>>> You are using unsafe language features to break the type system. That is
>>>> not the fault of the type system.
>>>>
>>>> '@safe:' at the top of the program should stop both examples from
>>>> working, it is a bug that it does not.
>>>
>>> That's the point -- *which* checks are missing from @safe?
>>
>> Escaping stack data and arbitrarily freeing memory are not operations
>> found in memory safe languages.
> 
> HOW do you propose to check for escaping stack data?

/How/ is not a problem (ignoring implementation costs), the /language definition/
part is trickier - you need a very precise definition of what is allowed and what
isn't; otherwise different compilers will make different decisions and every
compiler will support only a vendor-specific non-std dialect...
(eg storing a scoped-ref into some kind of container, passing that down to other
functions could work, but what if you then need to let the container escape and
want to do that by removing the scoped-ref? It might be possible for the compiler
to prove that it's safe, but it's unlikely that every compiler will act the same)

>>> But I'm not sure that you're right, this looks broken to me, even
>>> without @safe.
>>>
>>> What does it mean to create immutable data on the stack? The stack is
>>> intrinsically mutable!
>>
>> So is the heap.
> 
> No it is not. Data on the stack *cannot* survive past the end of the function 
> call. Data on the heap can last forever.

Lifetime and mutability are different things.

>> What does it mean to garbage collect immutable data?
> 
> From the point of view of the application, it doesn't happen. There are no 
> observable semantics. It's merely an implementation detail.
> 
>> What does it mean to allocate an 'int' on the stack?
>>
>>> What does it mean to delete immutable data?
>>
>> Deallocate the storage for it and make it available for reuse.
>> Accessing it afterwards leads to arbitrary behaviour. This is the same
>> with mutable data. As the program may behave arbitrarily in this case,
>> it is valid behaviour to act as if immutable data changed.
> 
> No, you've broken the type system if you've deleted immutable data.
> If I have a reference to an immutable variable, I have a guarantee that it 
> will never change. delete will break that guarantee.

Yes. The alternative (to allow explicit delete on immutable data) would likely
be too complicated to be worth implementing in the near future - you need to
ensure the data is unique, there are no other refs to it, and forbid accessing
it after the 'delete' op. I guess an easy way out would be to ask the GC to run
a collect cycle and return back whether an object was successfully collected.
But i can't really see a useful application for it, and you'd need a special
convention, as the GC would have to be given the last ref to the object.


> With a mutable variable, I have no such guarantee. (It's not safe to allocate 
> something different in the deleted location, but it's OK to run the finalizer 
> and then wipe all the memory).
> 
>>> I think it's reasonable for both of them to require a cast, even in
>>> @system code.
>>>
>>
>> The implementation of the 'scope' storage class should be fixed. We
>> could then require an unsafe cast(scope) to disable prevention of stack
>> address escaping.
> 
> No we can't. f c

Re: make install; where do .di files go?

2012-10-18 Thread Manu
On 18 October 2012 14:36, Jacob Carlborg  wrote:

> On 2012-10-18 13:09, Manu wrote:
>
>  Really? The D compilers aren't ABI compatible on Linux?
>> Good to know. Cheers!
>>
>
> They all use different runtimes and at least GDC won't implement the D
> calling convention. GDC uses the calling convention of the backend. GDC
> also doesn't implement DMD inline assembly.


What's distinct about the D calling convention?


[OT] Re: More D & Rust

2012-10-18 Thread Nick Treleaven

On 17/10/2012 18:14, Michael wrote:

I can't compile even hello world on both Win 7 and Win XP.

rust 0.4, latest mingw.


The announcement seems to suggest you might need an older mingw:
https://mail.mozilla.org/pipermail/rust-dev/2012-October/002489.html

Also if you had 0.3 you need to uninstall it before installing 0.4.


Re: Import improvement

2012-10-18 Thread Nick Treleaven

On 15/10/2012 14:02, Peter Alexander wrote:

You could use something like this:

import std.(stdio, xml, algorithm);

Of course, there's many variations (square brackets, curly braces, no
dot, no commas...) but it's all bikeshedding.


Personally I like:

import package std : stdio, xml, algorithm;

compare with:

import std.algorithm : sort, swap;

This has nice symmetry and is unambiguous.


Re: Regarding hex strings

2012-10-18 Thread monarch_dodra

On Thursday, 18 October 2012 at 11:26:13 UTC, monarch_dodra wrote:


NOT a final version.


With correct-er utf string support. In theory, non-ascii 
characters are illegal, but it makes for safer code, and better 
diagnosis.


//
import std.array; // front/popFront for strings, unless already imported

ubyte[] decode(string s)
{
    ubyte[] ret;
    while (s.length)
    {
        // Skip separators; a separator must be followed by another byte.
        while (s.front == ' ' || s.front == '_')
        {
            s.popFront();
            if (!s.length)
                assert(0, text("Premature end of string."));
        }

        // High nibble (decoded as a full code point for better diagnostics).
        dchar c1 = s.front;
        if (!std.ascii.isHexDigit(c1))
            assert(0, text("Unexpected character ", c1, "."));
        c1 = std.ascii.toUpper(c1);

        s.popFront();
        if (!s.length)
            assert(0, text("Premature end of string after ", c1, "."));

        // Low nibble.
        dchar c2 = s.front;
        if (!std.ascii.isHexDigit(c2))
            assert(0, text("Unexpected character ", c2, " after ", c1, "."));
        c2 = std.ascii.toUpper(c2);
        s.popFront();

        ubyte val;
        if ('0' <= c2 && c2 <= '9') val += (c2 - '0');
        if ('A' <= c2 && c2 <= 'F') val += (c2 - 'A' + 10);
        if ('0' <= c1 && c1 <= '9') val += ((c1 - '0') * 16);
        if ('A' <= c1 && c1 <= 'F') val += ((c1 - 'A' + 10) * 16);
        ret ~= val;
    }
    return ret;
}
//


Re: make install; where do .di files go?

2012-10-18 Thread Iain Buclaw
On 18 October 2012 13:24, Manu  wrote:
> On 18 October 2012 14:36, Jacob Carlborg  wrote:
>>
>> On 2012-10-18 13:09, Manu wrote:
>>
>>> Really? The D compilers aren't ABI compatible on Linux?
>>> Good to know. Cheers!
>>
>>
>> They all use different runtimes and at least GDC won't implement the D
>> calling convention. GDC uses the calling convention of the backend. GDC also
>> doesn't implement DMD inline assembly.
>
>
> What's distinct about the D calling convention?

It distinctly sox. ;-)

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


Re: Regarding hex strings

2012-10-18 Thread bearophile

monarch_dodra:


hex! was a very good idea actually, imo.


It must scale up to "real world" usages. Try it with a program
composed of 3 modules each one containing a 100 KB long string.
Then try it with a program with two hundred medium-sized
literals, and let's see compilation times and binary sizes.

Bye,
bearophile


Re: Account on ARM/Debian

2012-10-18 Thread Alix Pexton

On 16/10/2012 12:11, Alix Pexton wrote:

On 16/10/2012 11:48, Iain Buclaw wrote:

Do you need a static IP set on the firewall to allow connections in?



I don't think so, will have to throw that one over to my brother to be
sure ^^



Make that a maybe!

After being on-line for less than 24 hours, the poor RasPi was being 
penetration tested from IPs as far apart as Korea and Sussex!


I'm still willing to let my RasPi be used for this fine cause, but my 
Brother wants to investigate why his IP has gotten interest from foreign 
powers, and reconfigure the firewall before it gets reconnected.


Aside from Iain, are there any other GDC/ARM developers (who might want 
access)? I'm considering making individual accounts for each user rather 
than just giving out the password to the "pi" user.


A...


Re: Account on ARM/Debian

2012-10-18 Thread Joseph Rushton Wakeling

On 10/18/2012 03:18 PM, Alix Pexton wrote:

I'm considering making individual accounts for each user rather than just giving
out the password to the "pi" user.


TBH that "pi" account seems like a massive security vulnerability for any RasPi 
that is open to remote login.  Yes, you can change the password, but I'd be 
inclined to remove it and set up an administrator account with a completely 
different name ...


Re: Account on ARM/Debian

2012-10-18 Thread Iain Buclaw
On 18 October 2012 14:18, Alix Pexton  wrote:
> On 16/10/2012 12:11, Alix Pexton wrote:
>>
>> On 16/10/2012 11:48, Iain Buclaw wrote:
>>>
>>> Do you need a static IP set on the firewall to allow connections in?
>>>
>>
>> I don't think so, will have to throw that one over to my brother to be
>> sure ^^
>
>
>
> Make that a maybe!
>
> After being on-line for less than 24 hours, the poor RasPi was being
> penetration tested from IPs as far apart as Korea and Sussex!
>

I'm in Sussex, it was probably me. :-p


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


Re: Account on ARM/Debian

2012-10-18 Thread Joseph Rushton Wakeling

On 10/17/2012 06:37 PM, David Nadlinger wrote:

Well, that depends on your definition of stable. LDC Git master is supposed to
always pass the CI tests, i.e. the DMD, druntime and Phobos test suites.


Which would explain my (limited, as I haven't been doing any development 
recently) experience of it seeming very stable and effective when built (and 
rebuilt) from git sources. :-)


Have any optimization improvements landed recently?  It really seems like since 
my last email on the subject, the speed of executables has improved to be about 
the same as GDC.



A possible extension of that would be to have a separate »stable« Git branch
which is automatically advanced along with master by the CI system whenever a
given revision passes all the tests. If somebody wants to set up a system like
this, I'd be happy to officially adopt it.


That's not what I was really thinking about -- I trust you and the other LDC 
devs to make sure things pass all the automated tests before merging into master.



But in my experience, anything more than that, i.e. declaring revisions stable
based on criteria which can't be evaluated by an automatic test suite, is not
worth it, at least for smallish projects like LDC. Judging whether a given state
is stable by hand is notoriously hard to get right, and the reason we have beta
phases before releases, etc.


What I had in mind was that you might define a "stable" branch which is updated 
according to certain new-feature milestones, and which in the interim between 
those milestones only receives bugfixes, not new features.


I guess the benefits of doing this depend on the extent to which you have a 
well-defined roadmap which would let you define a "milestone", though it might 
be possible to do it on the basis of the frontend/druntime/phobos version.


It probably seems not-worth-doing unless you are making official releases 
anyway, but from my point of view as a "consumer" of LDC it feels like a nice 
option to be able to have a branch that is updated more slowly -- i.e. with 
material that isn't just the latest patches, but that has been around for a 
while so that the devs have had time to spot any holes.


Re: Regarding hex strings

2012-10-18 Thread Kagamin

On Thursday, 18 October 2012 at 09:42:43 UTC, monarch_dodra wrote:
Have you actually ever written code that requires using code 
points? This feature is a *huge* convenience for when you do. 
Just compare:


string nihongo1 = x"e697a5 e69cac e8aa9e";
string nihongo2 = "\ue697a5\ue69cac\ue8aa9e";
ubyte[] nihongo3 = [0xe6, 0x97, 0xa5, 0xe6, 0x9c, 0xac, 0xe8, 
0xaa, 0x9e];


You should use unicode directly here, that's the whole point to 
support it.

string nihongo = "日本語";
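
For comparison, the different spellings can be checked against each other. The sketch below is an illustration, not code from the thread: it assumes the UTF-8 encoding of the three characters is the byte sequence already quoted above, and it uses `\u` with four hex digits naming a code point (U+65E5, U+672C, U+8A9E) and `\x` with one byte per escape. The x"" hex string form is left out so that the example does not depend on that feature being available.

```d
import std.string : representation;

void main()
{
    // The same three characters written four ways.
    string literal  = "日本語";
    string escaped  = "\u65e5\u672c\u8a9e";                   // one \u escape per code point
    string rawBytes = "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e"; // one \x escape per UTF-8 byte
    immutable(ubyte)[] bytes =
        [0xe6, 0x97, 0xa5, 0xe6, 0x9c, 0xac, 0xe8, 0xaa, 0x9e];

    assert(literal == escaped);
    assert(literal == rawBytes);

    // representation() views the string's UTF-8 code units as ubytes.
    assert(literal.representation == bytes);

    // .length counts UTF-8 code units, not characters.
    assert(literal.length == 9);
}
```

Note that the `\u` escapes take code points, not the encoded bytes, so the byte values from the hex string cannot simply be copied after `\u`.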


Re: Bits rotations

2012-10-18 Thread Don Clugston

On 18/10/12 11:39, Iain Buclaw wrote:

On 18 October 2012 09:27, bearophile  wrote:

Iain Buclaw:



In the gdc-4.6 package you have there, it's only naked asm that can't be
inlined.



Good.




However it is worth noting that DIASM is no longer in mainline gdc.



What's DIASM? Is it the D syntax for asm code? If this is right, then gdc
developers have made a mistake, reducing D code interoperability, creating
an incompatibility where there wasn't one (and reducing my desire to use gdc or
to switch to it, because I have hundreds of lines of inlined asm in my D
code), this means doing the opposite of what generally compiler writers are
supposed to do (maybe this topic was discussed already, in the past).

Bye,
bearophile



This topic has been discussed in the past.  And the current status is
that GCC mainline has poisoned the frontend to use certain headers
that the IASM implementation in GDC depended on.

Example:

int zz(int p1)
{
   asm {
 naked;
 mov EAX, p1[EBP];
   }
}


To calculate p1[EBP], one would have to know where p1 will land on the
frame pointer to replace it with the relevant offset value.  This
would mean from the front-end we would have to invoke the back-end to
generate and tell us the stack frame layout of zz, which is not
possible because:


FYI: That code doesn't work in DMD either.
DMD assumes a frame pointer is created in naked ASM, which is totally 
wrong. Code like that should not compile. The compiler does not know 
what the correct offsets are and should not attempt to try.



a) Invoking this before the optimisation passes may produce a
different result to what that actual result is after the optimisation
passes.
b) All functions are sitting in poisoned (for the front-end) headers.

There is an opportunity to defer parsing IASM until the GIMPLE
(middle-end) stage, however am still unable to retrieve the required
information to produce the correct codegen.


Are you just talking about naked asm? Conceptually naked asm should act 
as if it was created in an assembler in a separate obj file, and 
accessed via extern(C).

If you have problems with non-naked asm, that would make more sense to me.



Re: Regarding hex strings

2012-10-18 Thread Don Clugston

On 18/10/12 10:58, foobar wrote:

On Thursday, 18 October 2012 at 02:47:42 UTC, H. S. Teoh wrote:

On Thu, Oct 18, 2012 at 02:45:10AM +0200, bearophile wrote:
[...]

hex strings are useful, but I think they were invented in D1 when
strings were convertible to char[]. But today they are an array of
immutable UTF-8, so I think this default type is not so useful:

void main() {
string data1 = x"A1 B2 C3 D4"; // OK
immutable(ubyte)[] data2 = x"A1 B2 C3 D4"; // error
}


test.d(3): Error: cannot implicitly convert expression
("\xa1\xb2\xc3\xd4") of type string to ubyte[]

[...]

Yeah I think hex strings would be better as ubyte[] by default.

More generally, though, I think *both* of the above lines should be
equally accepted.  If you write x"A1 B2 C3" in the context of
initializing a string, then the compiler should infer the type of the
literal as string, and if the same literal occurs in the context of,
say, passing a ubyte[], then its type should be inferred as ubyte[], NOT
string.


T


IMO, this is a redundant feature that complicates the language for no
benefit and should be deprecated.
strings already have an escape sequence for specifying code-points "\u"
and for ubyte arrays you can simply use:
immutable(ubyte)[] data2 = [0xA1, 0xB2, 0xC3, 0xD4];

So basically this feature gains us nothing.


That is not the same. Array literals are not the same as string 
literals, they have an implicit .dup.
See my recent thread on this issue (which unfortunately seems to have 
died without a resolution, people got hung up about trailing null 
characters without apparently noticing the more important issue of the dup).




Re: Regarding hex strings

2012-10-18 Thread monarch_dodra

On Thursday, 18 October 2012 at 13:15:55 UTC, bearophile wrote:

monarch_dodra:


hex! was a very good idea actually, imo.


It must scale up to "real world" usages. Try it with a program
composed of 3 modules each one containing a 100 KB long string.
Then try it with a program with two hundred medium-sized
literals, and let's see compilation times and binary sizes.

Bye,
bearophile


Hum... The compilation is pretty fast actually, about 1 second, 
provided it doesn't choke.


It works for strings up to a length of 400 lines @ 80 chars per 
line, which results in approximately 16K of data. After that, I 
get a DMD out of memory error.


DMD memory usage spikes quite quickly. To compile those 400 lines 
(16K), I use 800MB of memory (!). If I reach about 1GB, then it 
crashes.


I tried using a refAppender instead of ret~, but that changed 
nothing.


Kind of weird it would use that much memory though...

Also, the memory doesn't get released. I can parse a 1x400 Line 
string, but if I try to parse 3 of them, DMD will choke on the 
second one. :(


Re: Account on ARM/Debian

2012-10-18 Thread Alix Pexton

On 18/10/2012 14:32, Joseph Rushton Wakeling wrote:

On 10/18/2012 03:18 PM, Alix Pexton wrote:

I'm considering making individual accounts for each user rather than
just giving
out the password to the "pi" user.


TBH that "pi" account seems like a massive security vulnerability for
any RasPi that is open to remote login.  Yes, you can change the
password, but I'd be inclined to remove it and set up an administrator
account with a completely different name ...


Any advice/instruction (the clearer the better) on how to set up my RasPi 
so that it is more secure is very welcome ^^


I'm also looking for a smallish USB hard drive to attach as a swap-drive 
so that there is scope to compile GDC, everything I have seen so far 
costs more than the RasPi did and does not fit my definition of "smallish".


A...


Re: Account on ARM/Debian

2012-10-18 Thread jerro
On Thursday, 18 October 2012 at 13:33:02 UTC, Joseph Rushton 
Wakeling wrote:

On 10/18/2012 03:18 PM, Alix Pexton wrote:
I'm considering making individual accounts for each user 
rather than just giving

out the password to the "pi" user.


TBH that "pi" account seems like a massive security 
vulnerability for any RasPi that is open to remote login.  Yes, 
you can change the password, but I'd be inclined to remove it 
and set up an administrator account with a completely different 
name ...


You could also add an AllowUsers setting to /etc/ssh/sshd_config 
and not include the pi user in it.


Re: Bits rotations

2012-10-18 Thread Iain Buclaw
On 18 October 2012 15:22, Don Clugston  wrote:
> On 18/10/12 11:39, Iain Buclaw wrote:
>>
>> On 18 October 2012 09:27, bearophile  wrote:
>>>
>>> Iain Buclaw:
>>>
>>>
>>>> In the gdc-4.6 package you have there, it's only naked asm that can't be
>>>> inlined.
>>>
>>>
>>>
>>> Good.
>>>
>>>
>>>
>>>> However it is worth noting that DIASM is no longer in mainline gdc.
>>>
>>>
>>>
>>> What's DIASM? Is it the D syntax for asm code? If this is right, then gdc
>>> developers have done a mistake, reducing D code interoperability,
>>> creating
>>> an incompatibility where there wasn't (and reducing my desire to use gdc
>>> or
>>> to switch to it, because I have hundreds of lines of inlined asm in my D
>>> code), this means doing the opposite of what generally compiler writers
>>> are
>>> supposed to do (maybe this topic was discussed already, in past).
>>>
>>> Bye,
>>> bearophile
>>
>>
>>
>> This topic has been discussed in the past.  And the current status is
>> that GCC mainline has poisoned the frontend to use certain headers
>> that the IASM implementation in GDC depended on.
>>
>> Example:
>>
>> int zz(int p1)
>> {
>>    asm {
>>      naked;
>>      mov EAX, p1[EBP];
>>    }
>> }
>>
>>
>> To calculate p1[EBP], one would have to know where p1 will land on the
>> frame pointer to replace it with the relevant offset value.  This
>> would mean from the front-end we would have to invoke the back-end to
>> generate and tell us the stack frame layout of zz, which is not
>> possible because:
>
>
> FYI: That code doesn't work in DMD either.
> DMD assumes a frame pointer is created in naked ASM, which is totally wrong.
> Code like that should not compile. The compiler does not know what the
> correct offsets are and should not attempt to try.
>
>
>> a) Invoking this before the optimisation passes may produce a
>> different result to what that actual result is after the optimisation
>> passes.
>> b) All functions are sitting in poisoned (for the front-end) headers.
>>
>> There is an opportunity to defer parsing IASM until the GIMPLE
>> (middle-end) stage, however am still unable to retrieve the required
>> information to produce the correct codegen.
>
>
> Are you just talking about naked asm? Conceptually naked asm should act as
> if it was created in an assembler in a separate obj file, and accessed via
> extern(C).
> If you have problems with non-naked asm, that would make more sense to me.
>

Normal assembler... naked assembler has its own set of problems
(requires patching in a "naked" style attribute which the x86 GCC
maintainers rejected outright).


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


Re: Tricky semantics of ranges & potentially numerous Phobos bugs

2012-10-18 Thread H. S. Teoh
On Thu, Oct 18, 2012 at 09:09:03AM +0200, Don Clugston wrote:
> On 17/10/12 23:41, H. S. Teoh wrote:
[...]
> >I think I'm not so sure about Andrei's lumping input ranges with
> >persistent return values from .front together with forward ranges.
> >Some algorithms, like findAdjacent, do not need a forward range, but
> >they do need a persistent .front. I do not like the idea of
> >artificially limiting the scope of findAdjacent just because you
> >can't assume input ranges' .front returns a persistent value. Like
> >somebody else mentioned, whether .front is transient or not is
> >orthogonal to whether the range is an input range or a forward range.
> >There can be ranges whose .front is persistent, but they can't be
> >forward ranges for practical reasons.
> 
> Is it actually orthogonal? Is it possible for a forward range to be
> transient?
[...]

What about a range over all permutations of an array, that modifies the
array in-place? It can be a forward range by having .save copy the
current state of the array, but .front is transient nonetheless.


T

-- 
Life is too short to run proprietary software. -- Bdale Garbee
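
The permutation range mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not code from the thread: the struct name Permutations and its fields are made up, and it leans on Phobos' nextPermutation to do the in-place step. It shows a range that satisfies isForwardRange (save copies the buffer) while front stays transient (it hands out the very buffer that popFront overwrites).

```d
import std.algorithm : nextPermutation;
import std.range : isForwardRange;

// Hypothetical range over all permutations of an int array, permuting in place.
// The array should start out sorted if every permutation is to be visited.
struct Permutations
{
    int[] buf;   // the single buffer that is permuted in place
    bool done;

    @property bool empty() const { return done; }

    // Transient: returns the internal buffer, which popFront overwrites.
    @property int[] front() { return buf; }

    // nextPermutation returns false once it wraps around to the first permutation.
    void popFront() { done = !nextPermutation(buf); }

    // Forward range: save snapshots the current state into a fresh buffer.
    @property Permutations save() { return Permutations(buf.dup, done); }
}

static assert(isForwardRange!Permutations);

void main()
{
    auto r = Permutations([1, 2, 3]);
    int[] kept = r.front;        // aliases the internal buffer
    int[] copied = r.front.dup;  // an explicit copy survives popFront
    auto checkpoint = r.save;    // independent copy of the iteration state

    r.popFront();
    assert(copied == [1, 2, 3]);
    assert(kept == [1, 3, 2]);             // the earlier front was invalidated
    assert(checkpoint.front == [1, 2, 3]); // the saved range is unaffected

    size_t count;
    for (; !checkpoint.empty; checkpoint.popFront())
        ++count;
    assert(count == 6);
}
```

An algorithm that holds on to the slice returned by front across a popFront sees it change underneath, even though the range itself can be saved and replayed.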


Re: Regarding hex strings

2012-10-18 Thread foobar

On Thursday, 18 October 2012 at 14:29:57 UTC, Don Clugston wrote:

On 18/10/12 10:58, foobar wrote:

On Thursday, 18 October 2012 at 02:47:42 UTC, H. S. Teoh wrote:

On Thu, Oct 18, 2012 at 02:45:10AM +0200, bearophile wrote:
[...]
hex strings are useful, but I think they were invented in D1 when
strings were convertible to char[]. But today they are an array of
immutable UTF-8, so I think this default type is not so useful:


void main() {
   string data1 = x"A1 B2 C3 D4"; // OK
   immutable(ubyte)[] data2 = x"A1 B2 C3 D4"; // error
}


test.d(3): Error: cannot implicitly convert expression
("\xa1\xb2\xc3\xd4") of type string to ubyte[]

[...]

Yeah I think hex strings would be better as ubyte[] by default.

More generally, though, I think *both* of the above lines should be
equally accepted.  If you write x"A1 B2 C3" in the context of
initializing a string, then the compiler should infer the type of the
literal as string, and if the same literal occurs in the context of,
say, passing a ubyte[], then its type should be inferred as ubyte[], NOT
string.


T


IMO, this is a redundant feature that complicates the language for no
benefit and should be deprecated.
strings already have an escape sequence for specifying code-points "\u"
and for ubyte arrays you can simply use:
immutable(ubyte)[] data2 = [0xA1, 0xB2, 0xC3, 0xD4];

So basically this feature gains us nothing.


That is not the same. Array literals are not the same as string 
literals, they have an implicit .dup.
See my recent thread on this issue (which unfortunately seems 
to have died without a resolution, people got hung up about 
trailing null characters without apparently noticing the more 
important issue of the dup).


I don't see how that detail is relevant to this discussion as I 
was not arguing against string literals or array literals in 
general.


We can still have both (assuming the code points are valid...):
string foo = "\u00a1\u00b2\u00c3"; // no .dup
and:
ubyte[3] goo = [0xa1, 0xb2, 0xc3]; // implicit .dup


Re: [OT] Re: More D & Rust

2012-10-18 Thread Michael

The announcement seems to suggest you might need an older mingw:
https://mail.mozilla.org/pipermail/rust-dev/2012-October/002489.html

Also if you had 0.3 you need to uninstall it before installing 
0.4.


Yes. I read it. It is annoying.



Re: How mutable is immutable?

2012-10-18 Thread Timon Gehr

On 10/18/2012 10:08 AM, Don Clugston wrote:

On 17/10/12 18:02, Timon Gehr wrote:

On 10/17/2012 01:49 PM, Don Clugston wrote:

...

That's the point -- *which* checks are missing from @safe?


Escaping stack data and arbitrarily freeing memory are not operations
found in memory safe languages.


HOW do you propose to check for escaping stack data?



Static escape analysis. Use the 'scope' qualifier to designate
data that is not allowed to be escaped in order to make it modular.


...


The implementation of the 'scope' storage class should be fixed. We
could then require an unsafe cast(scope) to disable prevention of stack
address escaping.


No we can't. f cannot know that the string it has been given is on the
stack. So main() must prevent it from being given to f() in the first
place. How can it do that?



f can know that it mustn't escape it, which is enough.


void foo(bool b, string y)
{
   immutable (char)[4] x = "abba";
   string s = b ? x : y;
   f(s);
}

Make it safe.



It is safe if the parameter to f is marked with 'scope'. (and this in
turn obliges f not to escape it.)

Analyze scope on the expression level.

The analysis would determine that x[] is 'scope'. It would
conservatively propagate this fact to (b ? x[] : y). Then the local
variable 's' will get the 'scope' storage class.

In general, use a fixed-point iteration to determine all local
variables that might refer to scope'd data and prevent them from being
escaped.
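
The propagation described above can be pictured on the foo example. This is only a sketch under assumptions: the body of f is a made-up placeholder that merely reads its argument, and the 'scope' annotations are written where the analysis would require or infer them. Nothing in it relies on a compiler actually enforcing them.

```d
// 'scope' on the parameter is the promise that f keeps no reference to s
// once it returns. This placeholder body only reads the data.
size_t f(scope const(char)[] s)
{
    return s.length;
}

size_t foo(bool b, string y)
{
    immutable(char)[4] x = "abba";   // lives on the stack
    // x[] refers to stack data, so the whole conditional expression is
    // treated conservatively as scope, and so is the local s.
    scope const(char)[] s = b ? x[] : y;
    return f(s);                     // fine: f's parameter is scope as well
}

void main()
{
    assert(foo(true, "xyz") == 4);   // took the stack-allocated branch
    assert(foo(false, "xyz") == 3);  // took the caller-supplied string
}
```

Under such a rule, a version of f that stored s in a global would be the thing rejected, rather than the call site in foo.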




Rust's borrowed pointers may give some hints on how
to extend 'scope' to fields of structs.


I think it is more fundamental than that.


As to delete, delete is as unsafe when the involved data is immutable
as when it is mutable. Why require an additional cast in one case?


This is not about safety.
Modifying immutable data breaks the type system. Deleting mutable data
does not.
AFAIK it is safe to implement delete as a call to the
finalizer, followed by setting the memory to T.init.
...



Now I see where you are coming from. This is indeed a safe approach for
references to/arrays of fully mutable value types, but not for delete
in general.

Make sure to treat void* specially though.

struct S{ immutable int x; this(int x){this.x=x;}}

void main()@safe{
void* s = new S(2);
delete s;
}

Class instance memory does not have a T.init, because it is not
assigned a T. And even if it was, how would you know at compile time if
the bound instance has any immutable fields?
Should that be a runtime exception?



Re: Regarding hex strings

2012-10-18 Thread Jonathan M Davis
On Thursday, October 18, 2012 15:56:50 Kagamin wrote:
> On Thursday, 18 October 2012 at 09:42:43 UTC, monarch_dodra wrote:
> > Have you actually ever written code that requires using code
> > points? This feature is a *huge* convenience for when you do.
> > Just compare:
> > 
> > string nihongo1 = x"e697a5 e69cac e8aa9e";
> > string nihongo2 = "\ue697a5\ue69cac\ue8aa9e";
> > ubyte[] nihongo3 = [0xe6, 0x97, 0xa5, 0xe6, 0x9c, 0xac, 0xe8,
> > 0xaa, 0x9e];
> 
> You should use unicode directly here, that's the whole point to
> support it.
> string nihongo = "日本語";

It's a nice feature, but there are plenty of cases where it makes more sense 
to use the unicode values rather than the characters themselves (e.g. your 
keyboard doesn't have the characters in question). It's valuable to be able to 
do it both ways.

- Jonathan M Davis


Re: Shared keyword and the GC?

2012-10-18 Thread Sean Kelly
On Oct 17, 2012, at 1:55 AM, Alex Rønne Petersen  wrote:
> 
> So, let's look at D:
> 
> 1. We have global variables.
> 2. Only std.concurrency enforces isolation at a type system level; it's not
> built into the language, so the GC cannot make assumptions.
> 3. The shared qualifier effectively allows pointers from one thread's heap
> into another's.

Well, the problem is more that a variable can be cast to shared after 
instantiation, so to allow thread-local collections we'd have to make 
cast(shared) set a flag on the memory block to indicate that it's shared, and 
vice-versa for unshared.  Then when a thread terminates, all blocks not flagged 
as shared would be finalized, leaving the shared blocks alone.  Then any pool 
from the terminated thread containing a shared block would have to be merged 
into the global heap instead of released to the OS.

I think we need to head in this direction anyway, because we need to make sure 
that thread-local data is finalized by its owner thread.  A block's owner would 
be whoever allocated the block or if cast to shared and back to unshared, 
whichever thread most recently cast the block back to unshared.  Tracking the 
owner of a block gives us the shared state implicitly, making thread-local 
collections possible.  Who wants to work on this? :-)
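
The bookkeeping described above can be modelled in a few lines. This is a toy model only, under stated assumptions: Block, castToShared, castToUnshared and onThreadExit are hypothetical names, thread ids are plain ints, and nothing here touches the real druntime GC. It shows the shared flag being set and cleared on casts, ownership moving to the thread that casts back to unshared, and a terminating thread finalizing only its own unshared blocks.

```d
// Toy stand-in for the per-block metadata; not a druntime structure.
struct Block
{
    int  owner;      // id of the thread that owns the block
    bool isShared;   // set by cast(shared), cleared by casting back
    bool finalized;
}

// What cast(shared) would have to record on the block.
void castToShared(ref Block b)
{
    b.isShared = true;
}

// Casting back to unshared: the casting thread becomes the owner.
void castToUnshared(ref Block b, int thread)
{
    b.isShared = false;
    b.owner = thread;
}

// On thread termination, finalize the thread's unshared blocks and leave
// shared ones alone. Returns how many blocks were finalized.
size_t onThreadExit(Block[] heap, int thread)
{
    size_t n;
    foreach (ref b; heap)
    {
        if (b.owner == thread && !b.isShared && !b.finalized)
        {
            b.finalized = true;
            ++n;
        }
    }
    return n;
}

void main()
{
    auto heap = [Block(1), Block(1), Block(2)];
    castToShared(heap[1]);

    // Thread 1 exits: only its unshared block is finalized.
    assert(onThreadExit(heap, 1) == 1);
    assert(heap[0].finalized && !heap[1].finalized);

    // Thread 2 casts the shared block back and thereby becomes its owner.
    castToUnshared(heap[1], 2);
    assert(onThreadExit(heap, 2) == 2);
}
```

The model leaves out the part about merging pools that still contain shared blocks into the global heap, which is where most of the real work would be.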

Re: Shared keyword and the GC?

2012-10-18 Thread Jacob Carlborg

On 2012-10-18 20:26, Sean Kelly wrote:


Well, the problem is more that a variable can be cast to shared after 
instantiation, so to allow thread-local collections we'd have to make 
cast(shared) set a flag on the memory block to indicate that it's shared, and 
vice-versa for unshared.  Then when a thread terminates, all blocks not flagged 
as shared would be finalized, leaving the shared blocks alone.  Then any pool 
from the terminated thread containing a shared block would have to be merged 
into the global heap instead of released to the OS.


Or move the shared data to the global heap when it's cast. I don't know 
which is best. This way all data in a given pool will be truly thread local.


--
/Jacob Carlborg


Re: make install; where do .di files go?

2012-10-18 Thread Jacob Carlborg

On 2012-10-18 14:24, Manu wrote:


What's distinct about the D calling convention?


GDC uses the C calling convention (or whatever calling convention the 
system uses) where DMD uses a slightly modified version, if I 
recall correctly. Note that DMD only defines an ABI for x86 and possibly 
x86-64.


http://dlang.org/abi.html

--
/Jacob Carlborg


Re: Shared keyword and the GC?

2012-10-18 Thread Sean Kelly
On Oct 18, 2012, at 11:48 AM, Jacob Carlborg  wrote:

> On 2012-10-18 20:26, Sean Kelly wrote:
> 
>> Well, the problem is more that a variable can be cast to shared after 
>> instantiation, so to allow thread-local collections we'd have to make 
>> cast(shared) set a flag on the memory block to indicate that it's shared, 
>> and vice-versa for unshared.  Then when a thread terminates, all blocks not 
>> flagged as shared would be finalized, leaving the shared blocks alone.  Then 
>> any pool from the terminated thread containing a shared block would have to 
>> be merged into the global heap instead of released to the OS.
> 
> Or move the shared data to the global heap when it's casted. Don't know 
> that's best. This way all data in a give pool will be truly thread local.

And back down to a local pool when shared is cast away.  Assuming the block is 
even movable.  I agree that this would be the most efficient use of memory, but 
I don't know that it's feasible.

Re: make install; where do .di files go?

2012-10-18 Thread Sean Kelly
On Oct 18, 2012, at 11:45 AM, Jacob Carlborg  wrote:

> On 2012-10-18 14:24, Manu wrote:
> 
>> What's distinct about the D calling convention?
> 
> GDC uses the C calling convention (or whatever the calling convention used by 
> the system) where DMD uses a slightly modified version, if I recall 
> correctly. Note that DMD only defines an ABI for x86 and possible x86-64.

On x86_32, the first argument is passed in EAX with the D calling convention.  
But DMD is the only compiler that does this.
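
For readers less familiar with how D picks a calling convention: it follows from the function's linkage attribute. The sketch below (function names are hypothetical) only checks the declared linkage; it does not show register allocation, which depends on the compiler and target.

```d
// Same body, two linkages. extern (D) is the default for D functions.
extern (C) int addC(int a, int b) { return a + b; }
extern (D) int addD(int a, int b) { return a + b; }

// __traits(getLinkage, ...) reports the linkage as a string.
static assert(__traits(getLinkage, addC) == "C");
static assert(__traits(getLinkage, addD) == "D");

void main()
{
    // Call sites look identical; only the generated call sequence may differ.
    assert(addC(2, 3) == 5);
    assert(addD(2, 3) == 5);
}
```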

Re: make install; where do .di files go?

2012-10-18 Thread Iain Buclaw
On 18 October 2012 19:45, Jacob Carlborg  wrote:
> On 2012-10-18 14:24, Manu wrote:
>
>> What's distinct about the D calling convention?
>
>
> GDC uses the C calling convention (or whatever the calling convention used
> by the system) where DMD uses a slightly modified version, if I recall
> correctly. Note that DMD only defines an ABI for x86 and possible x86-64.
>
> http://dlang.org/abi.html
>
> --
> /Jacob Carlborg

Yep, and as I've repeatedly pointed out, this calling convention is
only defined (in the specs) for Win32 targets.  However, the DMD
compiler appears to in fact apply this to all platforms running on x86.



-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


Re: Regarding hex strings

2012-10-18 Thread Kagamin
Your keyboard doesn't have ready unicode values for all 
characters either.


Re: Shared keyword and the GC?

2012-10-18 Thread Jacob Carlborg

On 2012-10-18 20:54, Sean Kelly wrote:


And back down to a local pool when shared is cast away.  Assuming the block is 
even movable.  I agree that this would be the most efficient use of memory, but 
I don't know that it's feasible.


You said the thread local heap would be merged with the global on thread 
termination. How is that different?


Alternatively, it could stay in the global heap. I mean, not many variables 
should be "shared" and even fewer should be cast back and forth.


--
/Jacob Carlborg


Re: 48 hour game jam

2012-10-18 Thread Jacob Carlborg

On 2012-10-17 16:14, Manu wrote:


Okay, awesome. Cheers.
I got it running on Linux-x64 last night, so I think that's the only
thing holding OSX back.


Ok, I have a fully working window, which behaves properly and receives 
events. In addition to that I have an application menu with the standard 
items and a dock icon. This is all running completely separately.


Now I would need some help to integrate this back into the game engine.

This is the latest error I have when compiling:

In file included from ../Source/MFFont.cpp:5:
In file included from ../Source/MFTexture_Internal.h:7:
In file included from ../Source/../Source/Drivers/OpenGL/MFOpenGL.h:46:
../Source/../Source/Drivers/OpenGL/glew/glxew.h:97:10: fatal error: 
'X11/Xlib.h' file not found
#include <X11/Xlib.h>
         ^
12 warnings and 1 error generated.
make[1]: *** [../Build/Release/Fuji/MFFont.o] Error 1
make: *** [Fuji] Error 2

Either "glxew.h" shouldn't be used or it needs to have a Mac OS X version.

--
/Jacob Carlborg


Calling conventions (was: make install; where do .di files go?)

2012-10-18 Thread David Nadlinger
On Thursday, 18 October 2012 at 18:45:29 UTC, Jacob Carlborg 
wrote:

On 2012-10-18 14:24, Manu wrote:


What's distinct about the D calling convention?


GDC uses the C calling convention (or whatever the calling 
convention used by the system)


There is no single »C calling convention«, it varies between 
different OSes and architectures. But yes, as far as I know the 
calling convention used by GDC for extern(D) is the same as GCC 
(the C compiler) defaults to.


where DMD uses a slightly modified version, if I recall 
correctly. Note that DMD only defines an ABI for x86 and 
possible x86-64.


http://dlang.org/abi.html


The situation on x86_64 is actually quite different from the one 
on x86: On x86, DMD uses a pretty unique calling convention, 
which for example passes the first/last integer parameter in EAX. 
This calling convention is usually very cumbersome to support in 
alternative compilers – for example, while calling conventions 
are easily extensible in LLVM, it still isn't a small task, and 
requires directly patching LLVM. It is unlikely that this calling 
convention will ever be supported by an alternative compiler, 
simply because there is little motivation for somebody to step up 
and implement it.


On x86_64, however, the »The extern (C) and extern (D) calling 
convention matches the C calling convention used by the supported 
C compiler on the host system« clause from the ABI documentation 
applies and DMD also tries to follow it. However, what DMD 
produces unfortunately doesn't exactly match the System V x86_64 
ABI – for example, it passes the parameters in reverse order, 
for reasons unknown (if it's just a hack to make it seem as if 
LTR parameter evaluation was implemented, then it's … well, a 
huge hack with considerable negative consequences).


Actually, Walter, could you please clarify the reasons for this 
deviation from the spec? While LDC currently follows DMD [1] GDC 
doesn't; and when I'm going to overhaul the LDC x86_64 ABI 
implementation soon, it would be nice to just remove the hack and 
make LDC actually conform to the D (!) spec.


Which brings me to my next point: Currently, the extern(D) 
calling convention used by LDC on x86_64 is neither the DMD one 
nor the C one, but a custom design. It is back from when LDC was 
the first "stable" compiler for x86_64, and I think the intention 
of Frits and the others back then was to make it similar to the 
DMD x86 ABI implementation (the D). I do intend to make extern(D) 
behave the same as extern(C), as mandated by the spec, but I'd 
like to know whether the DMD argument order is going to stay 
reversed to avoid breaking the ABI twice.


So much for the »low-level« parameter-passing part of the 
ABI/calling convention. There are several more issues regarding 
cross-compiler compatibility which should also not be overlooked. 
For example, the layout of nested contexts (for closures, nested 
structs, …) is currently completely unspecified, and as a 
result differs between DMD and LDC (and GDC probably as well). 
Also, the runtime interface has slight differences, although 
these should be fairly easy to reconcile. Exception handling is 
another difference…


All those differences should definitely be fixed, at least for 
x86_64 and future platforms like ARM, because a unified ABI is 
definitely a very good thing to have – C++ continues to suffer 
dearly for not specifying one. But this is only going to happen 
if all three compiler implementations are actively working 
together.


David


[1] By simply reversing all the parameters in both LLVM function 
declarations and calls – duh!


Re: Calling conventions (was: make install; where do .di files go?)

2012-10-18 Thread Iain Buclaw
On Thursday, 18 October 2012 at 19:56:24 UTC, David Nadlinger 
wrote:


All those differences should definitely be fixed, at least for 
x86_64 and future platforms like ARM, because an unified ABI is 
definitely a very good thing to have – C++ continues to 
suffer dearly for not specifying one. But this is only going to 
happen if all three compiler implementations are actively 
working together.




And I do firmly believe that 2 out of 3 are starting to tie some 
knots together to begin doing this. :-)



Thanks,
Iain.


Re: 48 hour game jam

2012-10-18 Thread Manu
On 18 October 2012 22:26, Jacob Carlborg  wrote:

> On 2012-10-17 16:14, Manu wrote:
>
>  Okay, awesome. Cheers.
>> I got it running on Linux-x64 last night, so I think that's the only
>> thing holding OSX back.
>>
>
> Ok, I have a fully working window, which behaves properly and receives
> events. In addition to that I have an application menu with the standard
> items and a dock icon. This is all running completely separate.
>
> No I would need some help to integrate this back in the game engine.
>
> This is the latest error I have when compiling:
>
> In file included from ../Source/MFFont.cpp:5:
> In file included from ../Source/MFTexture_Internal.h:7:
> In file included from ../Source/../Source/Drivers/OpenGL/MFOpenGL.h:46:
> ../Source/../Source/Drivers/OpenGL/glew/glxew.h:97:10: fatal error:
> 'X11/Xlib.h' file not found
> #include <X11/Xlib.h>
>          ^
> 12 warnings and 1 error generated.
> make[1]: *** [../Build/Release/Fuji/MFFont.o] Error 1
> make: *** [Fuji] Error 2
>
> Either "glxew.h" shouldn't be used or it's need to have a Mac OS X version.


Ah yes, what do the OSX OpenGL libs look like? GLX is only a very thin
front end on a fairly conventional OpenGL. It's only a couple of functions
that would be replaced by some mac variant I expect.
I'll come on IRC in 5 minutes or so.


Re: Regarding hex strings

2012-10-18 Thread Jonathan M Davis
On Thursday, October 18, 2012 21:09:14 Kagamin wrote:
> Your keyboard doesn't have ready unicode values for all
> characters either.

So? That doesn't make it so that it's not valuable to be able to input the 
values in hexadecimal instead of as actual unicode characters. Heck, if you 
want a specific character, I wouldn't trust copying the characters anyway, 
because it's far too easy to have two characters which look really similar but 
are different (e.g. there are multiple types of angle brackets in unicode), 
whereas with the numbers you can be sure. And with some characters (e.g. 
unicode whitespace characters), it generally doesn't make sense to enter the 
characters directly.

Regardless, my point is that both approaches can be useful, so it's good to be 
able to do both. If you prefer to put the unicode characters in directly, then 
do that, but others may prefer the other way. Personally, I've done both.

- Jonathan M Davis


Re: Shared keyword and the GC?

2012-10-18 Thread Sean Kelly
On Oct 18, 2012, at 12:22 PM, Jacob Carlborg  wrote:

> On 2012-10-18 20:54, Sean Kelly wrote:
> 
>> And back down to a local pool when shared is cast away.  Assuming the block 
>> is even movable.  I agree that this would be the most efficient use of 
>> memory, but I don't know that it's feasible.
> 
> You said the thread local heap would be merged with the global on thread 
> termination. How is that different?
> 
> Alternative it could stay in the global heap. I mean, not many variables 
> should be "shared" and even fewer should be casted back and forth.

It's different in that a variable's address never actually changes.  When a 
thread completes it hands all of its pools to the shared allocator, and then 
per-thread allocators request free pools from the shared allocator before going 
to the OS.  This is basically how the HOARD allocator works.
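
A toy, single-threaded sketch of that hand-off, with no synchronization at all. The types `Pool`, `SharedAllocator`, and `ThreadAllocator` are invented for illustration and only mimic the bookkeeping, not HOARD itself or any druntime code.

```d
struct Pool { size_t id; }   // stand-in for a real memory pool

struct SharedAllocator
{
    Pool[] freePools;
    size_t nextId = 1;
    size_t osRequests;       // how often we had to "go to the OS"

    /// Takes over the pools of a finished thread. Addresses never change;
    /// only the bookkeeping moves.
    void adopt(Pool[] pools) { freePools ~= pools; }

    /// Hands out a free pool if there is one, else pretends to ask the OS.
    Pool acquire()
    {
        if (freePools.length)
        {
            auto p = freePools[$ - 1];
            freePools = freePools[0 .. $ - 1];
            return p;
        }
        ++osRequests;
        return Pool(nextId++);
    }
}

struct ThreadAllocator
{
    Pool[] pools;

    void grow(ref SharedAllocator g) { pools ~= g.acquire(); }

    void terminate(ref SharedAllocator g)
    {
        g.adopt(pools);
        pools = null;
    }
}

void main()
{
    SharedAllocator global;
    ThreadAllocator first, second;

    first.grow(global);
    first.grow(global);
    assert(global.osRequests == 2);

    first.terminate(global);            // pools go to the shared allocator
    assert(global.freePools.length == 2);

    second.grow(global);                // reuses a pool, no OS request
    assert(global.osRequests == 2);
    assert(second.pools.length == 1);
}
```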

Re: 48 hour game jam

2012-10-18 Thread Jacob Carlborg

On Thursday, 18 October 2012 at 20:11:56 UTC, Manu wrote:

Ah yes, what do the OSX OpenGL libs look like? GLX is only a 
very thin
front end on a fairly conventional OpenGL. It's only a couple 
of functions

that would be replaced by some mac variant I expect.


Mac OS X has the OpenGL framework:

https://developer.apple.com/library/mac/#documentation/GraphicsImaging/Reference/CGL_OpenGL/Reference/reference.html#//apple_ref/doc/uid/TP40001186

And a couple of high-level Objective-C classes. This is the 
programming guide for OpenGL on Mac OS X:


https://developer.apple.com/library/mac/#documentation/GraphicsImaging/Conceptual/OpenGL-MacProgGuide/opengl_intro/opengl_intro.html


I'll come on IRC in 5 minutes or so.


I won't be online tonight, it's getting late here. Tomorrow or 
perhaps Saturday.


--
/Jacob Carlborg


What about std.lockfree ?

2012-10-18 Thread denizzzka
Anyone interested in std.lockfree - lock-free lists, FIFOs, 
stacks etc?


I spent a few days implementing these procedures, but have not 
had any success so far.


My plan:

Many lock-free algorithms need an MCAS (multiple 
compare-and-swap) function, also called CASN (CAS for n 
elements). In fact, it looks very easy to maintain a 
doubly-linked list, a tree or a graph if you can change (or not 
change) all the links of one or more of its elements at the same 
time. (But do not forget about the ABA problem.)
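
For context, the single-word building block already exists in druntime as core.atomic.cas. The sketch below is only the classic CAS retry loop on one word, not MCAS/CASN or RDCSS; the names `counter`, `increment` and `worker` are made up for the example. A plain integer counter is not affected by the ABA problem in any harmful way, but pointer-based structures are.

```d
import core.atomic : atomicLoad, cas;
import core.thread : Thread;

shared int counter;

/// Adds one to `counter` without a lock, retrying if another thread won.
void increment()
{
    int seen;
    do
    {
        seen = atomicLoad(counter);
    } while (!cas(&counter, seen, seen + 1));
}

void worker()
{
    foreach (i; 0 .. 1000)
        increment();
}

void main()
{
    Thread[] threads;
    foreach (i; 0 .. 4)
    {
        auto t = new Thread(&worker);
        t.start();
        threads ~= t;
    }
    foreach (t; threads)
        t.join();

    // No increment is lost even though no mutex is used.
    assert(atomicLoad(counter) == 4000);
}
```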


But:

0. The MCAS/CASN and RDCSS algorithms at first look like 
brain-damaging puzzles.


1. Using unproven algorithms is out of the question - otherwise 
there is a risk that the algorithm will occasionally fail, and 
such failures will be hard to track down. This simplifies 
matters: just copy and paste ready-made procedures from the articles!


2. Almost everywhere, these algorithms need a CAS1 function
 - proposed function for core.atomic: casw 
(http://d.puremagic.com/issues/show_bug.cgi?id=8831#c4)
 - on top of casw, CAS1 can be implemented easily (line 136: 
https://github.com/denizzzka/casn/blob/75b0377aaa1424f3bd3fa3d47eddf4b5fd4e8038/casn.d)


3. I could not get the well-known RDCSS algorithm to run: it 
crashes on line 198 (Complete() function):


if(isDescriptor(r)) Complete(r);

I understand why it crashes - at the time Complete(r) is called, 
the pointer r can point to garbage, because the descriptor may 
already have been removed by another thread. But I do not 
understand why this algorithm works in other places.


RDCSS is described in the article "A Practical Multi-Word 
Compare-and-Swap Operation"; an explanation can also be found here: 
http://cstheory.stackexchange.com/questions/7083/a-practical-multi-word-compare-and-swap-operation


But it does not matter, because the RDCSS algorithm has one major 
flaw - it uses the two least significant bits as flags (perhaps 
the number of flags can be reduced to one) that indicate the type 
of information transmitted. It is a dirty hack and, as a result, 
RDCSS can be used only to exchange pointers to aligned 
data, which is not always acceptable.


Another procedure, called GCAS, is available. 
(http://lampwww.epfl.ch/~prokopec/ctries-snapshot.pdf)
Its semantics are similar to those of RDCSS, but I have not 
examined it closely. I plan to try it later.


This is all very complex, right? Is anyone else interested in it? 
Do you see an error in my reasoning here? Can someone help, or 
does anyone have a finished implementation of these algorithms?


It would be great if std.lockfree were created, because it 
would reveal many benefits of D associated with shared variables. 
As far as I know, Java and C# already have this.




Re: What about std.lockfree ?

2012-10-18 Thread denizzzka

(RDCSS is a part of MCAS)


Re: 48 hour game jam

2012-10-18 Thread F i L

Trying to build in Linux, but having problems.

I follow the steps from github wiki "How to build under Windows", 
except I run 'Fuji/create_project.sh' instead of '.bat'... now 
I'm a bit confused as to what steps to take. Running 'Fuji/make' 
has errors, and running 'Stache/make_project.sh' -> 'make' gives 
me:


make[1]: *** No targets.  Stop.
make: *** [Stache] Error 2

which I assume is because Fuji isn't built (?). Help please!

Nice screenshot, btw :)


Re: Const ref and rvalues again...

2012-10-18 Thread Malte Skarupke

On Thursday, 18 October 2012 at 06:11:26 UTC, monarch_dodra wrote:
On Thursday, 18 October 2012 at 04:30:17 UTC, Jonathan M Davis 
wrote:

On Thursday, October 18, 2012 06:24:08 jerro wrote:

What would be the problem with const ref taking rvalues?


Read the thread that I already linked to:

http://forum.dlang.org/thread/4f84d6dd.5090...@digitalmars.com

- Jonathan M Davis


I read the thread, and not a single one of the "problematic 
cases" are actually valid C++.


Yes: the faulty MSVC has taught people to do retarded things, 
or be afraid of things that were illegal to begin with (in 
particular, pass an rvalue to a ref, WHICH IS ILLEGAL IN C++), 
such as "increment(5)".


There is actually nothing wrong with creating a temporary when 
something is bound to a const ref, provided the compiler 
follows the rules:


*Only LValues with an EXACT type match may be passed to a 
reference.
*In regards to *const* references, RValues may be copied in a 
temporary, and that temporary bound to the ref.


I'm not saying we particularly *need* this in D (C++ has a "by 
ref" paradigm that makes it more important, but D *rarely* 
ever passes by const ref).


But if the compiler respects the above two rules (which it 
should), then RValue to const ref is both perfectly doable and 
safe (as safe as refs get anyways).


The problem with binding rvalues to const ref is that you could 
take and store the address of it. That's why I'd recommend using 
"in ref" instead.


@Jonathan: I had already read the linked discussion. There are 
many valid points in there, but also many invalid ones (as 
monarch_dodra has pointed out). But I think all problems in that 
thread should be solved by using "in ref" instead of "const ref" 
because then you'd be sure that the passed-in temporary can not 
escape the current function.


@foobar: I like the idea, but it's probably going to break down 
in many cases. If you have a non-trivial copy constructor you 
want the ability to have complete control over when it gets 
copied and when it doesn't. I just don't trust compilers enough 
to think that they'd always make the same choice that I'd make.
And also about losing semantic information: That's why I proposed 
the second point: Give the user the option to provide a function 
which should be preferred for rvalues. That way you don't lose 
semantic information.


Re: Regarding hex strings

2012-10-18 Thread Nick Sabalausky
On Wed, 17 Oct 2012 19:49:43 -0700
"H. S. Teoh"  wrote:

> On Thu, Oct 18, 2012 at 02:45:10AM +0200, bearophile wrote:
> [...]
> > hex strings are useful, but I think they were invented in D1 when
> > strings were convertible to char[]. But today they are an array of
> > immutable UTF-8, so I think this default type is not so useful:
> > 
> > void main() {
> >     string data1 = x"A1 B2 C3 D4"; // OK
> >     immutable(ubyte)[] data2 = x"A1 B2 C3 D4"; // error
> > }
> > 
> > 
> > test.d(3): Error: cannot implicitly convert expression
> > ("\xa1\xb2\xc3\xd4") of type string to ubyte[]
> [...]
> 
> Yeah I think hex strings would be better as ubyte[] by default.
> 
> More generally, though, I think *both* of the above lines should be
> equally accepted.  If you write x"A1 B2 C3" in the context of
> initializing a string, then the compiler should infer the type of the
> literal as string, and if the same literal occurs in the context of,
> say, passing a ubyte[], then its type should be inferred as ubyte[],
> NOT string.
> 

Big +1

Having the language expect x"..." to always be a string (let alone a
*valid UTF* string) is just insane. It's just too damn useful for
arbitrary binary data.
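
For what it's worth, a library-level spelling of the same binary data can be sketched with std.conv.hexString. This assumes a Phobos release that ships hexString, which postdates this thread; the function name `blob` is made up, and the cast is needed because hexString yields a string.

```d
import std.conv : hexString;

/// Four raw bytes written as hex digits; the cast reinterprets the
/// string's code units as ubytes without copying.
immutable(ubyte)[] blob()
{
    return cast(immutable(ubyte)[]) hexString!"A1 B2 C3 D4";
}

void main()
{
    immutable(ubyte)[] expected = [0xA1, 0xB2, 0xC3, 0xD4];
    assert(blob() == expected);
}
```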




Re: Const ref and rvalues again...

2012-10-18 Thread jerro
The problem with binding rvalues to const ref is that you could 
take and store the address of it. That's why I'd recommend 
using "in ref" instead.


You can also take and store the address of a local variable that 
was passed as a const ref parameter, and accessing it after the 
caller exits will result in undefined behavior too. On the other 
hand,  addresses of both local variables and rvalues will be 
valid at least until the called function returns. Local variables 
and rvalues are equivalent in regard to this problem.




Re: Const ref and rvalues again...

2012-10-18 Thread Timon Gehr

On 10/19/2012 01:39 AM, Malte Skarupke wrote:

On Thursday, 18 October 2012 at 06:11:26 UTC, monarch_dodra wrote:

On Thursday, 18 October 2012 at 04:30:17 UTC, Jonathan M Davis wrote:

On Thursday, October 18, 2012 06:24:08 jerro wrote:

What would be the problem with const ref taking rvalues?


Read the thread that I already linked to:

http://forum.dlang.org/thread/4f84d6dd.5090...@digitalmars.com

- Jonathan M Davis


I read the thread, and not a single one of the "problematic cases" are
actually valid C++.

Yes: the faulty MSVC has taught people to do retarded things, or be
afraid of things that were illegal to begin with (in particular, pass
an rvalue to a ref, WHICH IS ILLEGAL IN C++), such as "increment(5)".

There is actually nothing wrong with creating a temporary when
something is bound to a const ref, provided the compiler follows the
rules:

*Only LValues with an EXACT type match may be passed to a reference.
*In regards to *const* references, RValues may be copied in a
temporary, and that temporary bound to the ref.

I'm not saying we particularly *need* this in D (C++ has a "by ref"
paradigm that makes it more important, but D *rarely* ever passes by
const ref).

But if the compiler respects the above two rules (which it should),
then RValue to const ref is both perfectly doable and safe (as safe as
refs get anyways).


The problem with binding rvalues to const ref is that you could take and
store the address of it. That's why I'd recommend using "in ref" instead.

@Jonathan: I had already read the linked discussion. There are many
valid points in there, but also many invalid ones (as monarch_dodra has
pointed out). But I think all problems in that thread should be solved
by using "in ref" instead of "const ref" because then you'd be sure that
the passed-in temporary can not escape the current function.

@foobar: I like the idea, but it's probably going to break down in many
cases. If you have a non-trivial copy constructor you want the ability
to have complete control over when it gets copied and when it doesn't. I
just don't trust compilers enough to think that they'd always make the
same choice that I'd make.
And also about losing semantic information: That's why I proposed the
second point: Give the user the option to provide a function which
should be preferred for rvalues. That way you don't lose semantic
information.


Const is different in D and in C++. Relating const and rvalues is 
arbitrary and does not make a lot of sense.


Regarding 'in ref'/'scope ref': What should 'scope' apply to in

void foo(scope ref int* x);



Re: Regarding hex strings

2012-10-18 Thread bearophile

Nick Sabalausky:


Big +1

Having the language expect x"..." to always be a string (let 
alone a *valid UTF* string) is just insane. It's just too

damn useful for arbitrary binary data.


I'd like an opinion on such topics from one of the D bosses 
:-)


Bye,
bearophile


Re: Regarding hex strings

2012-10-18 Thread Nick Sabalausky
On Thu, 18 Oct 2012 12:11:13 +0200
"foobar"  wrote:
> 
> How often large binary blobs are literally spelled in the source 
> code (as opposed to just being read from a file)?


Frequency isn't the issue. The issues are "*Is* it ever needed?" and
"When it is needed, is it useful enough?" The answer to both is most
certainly "yes". (Remember, D is supposed to be usable as a systems
language, it's not merely a high-level-app-only language.)

Keep in mind, the question "Does it pull its own weight?" is for
adding new features, not for going around gutting the language
just because we can.

> In any case, I'm not opposed to such a utility library, in fact I 
> think it's a rather good idea and we already have a precedent 
> with "oct!"
> I just don't think this belongs as a built-in feature in the 
> language.

I think monarch_dodra's test proves that it definitely needs to be
built-in.



Re: Shared keyword and the GC?

2012-10-18 Thread Michel Fortin

On 2012-10-18 18:26:08 +, Sean Kelly  said:


Well, the problem is more that a variable can be cast to shared after
instantiation, so to allow thread-local collections we'd have to make
cast(shared) set a flag on the memory block to indicate that it's
shared, and vice-versa for unshared.  Then when a thread terminates, all
blocks not flagged as shared would be finalized, leaving the shared
blocks alone.  Then any pool from the terminated thread containing a
shared block would have to be merged into the global heap instead of
released to the OS.

I think we need to head in this direction anyway, because we need to
make sure that thread-local data is finalized by its owner thread.  A
blocks owner would be whoever allocated the block or if cast to shared
and back to unshared, whichever thread most recently cast the block back
to unshared.  Tracking the owner of a block gives us the shared state
implicitly, making thread-local collections possible.  Who wants to work
on this? :-)


All this is nice, but what is the owner thread for immutable data? 
Because immutable is always implicitly shared, all your strings and 
everything else that is immutable is thus "shared" and must be tracked 
by the global heap's collector and can never be handled by a 
thread-local collector. Even if most immutable data never leaves the 
thread it was allocated in, there's no way you can know.


I don't think per-thread GCs will work very well without support for 
immutable data, and for that you need to have a distinction between 
immutable and shared immutable (just like you have with mutable data). 
I complained about this almost three years ago when the semantics of 
shared were being defined, but it got nowhere. Quoting Walter at the 
time:


As for a shared gc vs thread local gc, I just see an awful lot of 
strange irreproducible bugs when someone passes data from one to the 
other. I doubt it's worth it, unless it can be done with compiler 
guarantees, which seem doubtful.


I think you'll have a hard time convincing Walter it is worth changing 
the behaviour of type modifiers at this point.


Reference:



--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/



Re: Const ref and rvalues again...

2012-10-18 Thread Malte Skarupke

On Friday, 19 October 2012 at 00:03:49 UTC, Timon Gehr wrote:


Const is different in D and in C++. Relating const and rvalues 
is arbitrary and does not make a lot of sense.


Regarding 'in ref'/'scope ref': What should 'scope' apply to in

void foo(scope ref int* x);


Not sure what you mean with "relating." I'm not making any claims 
about there being a relationship between rvalues and constness.


This is about finding a way that you can define a function which 
safely accepts lvalues and rvalues without having to make a copy. 
If we specify the argument as "ref in", then we can safely pass 
for example the number 5 to it. And this would never break 
existing code, so that something like swap(5, 4) would never be 
possible code.


For the example that you gave you'd be unable to store the 
address of x. So doing


int** storage;
void foo(scope ref int * x)
{
    storage = &x;
}

would be illegal.

@jerro: the same thing: I'm not trying to fix the problem that 
you mention. I'm trying to define a function which can safely 
accept rvalues and lvalues without having to make a copy.
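
For comparison, and only for templated functions, D's existing `auto ref` already takes lvalues by reference and rvalues by value. It is not the "in ref" proposal, and it says nothing about escaping. A minimal sketch, with the helper name `passedByRef` invented for the example:

```d
/// Reports how the argument was actually bound in this instantiation.
bool passedByRef(T)(auto ref const T x)
{
    return __traits(isRef, x);
}

void main()
{
    int n = 5;
    assert(passedByRef(n));    // lvalue: bound by reference, no copy
    assert(!passedByRef(5));   // rvalue: accepted too, taken by value
}
```

Each kind of argument gets its own template instantiation, which is why this does not carry over to non-template functions.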


Re: Tricky semantics of ranges & potentially numerous Phobos bugs

2012-10-18 Thread Marco Leise
Am Wed, 17 Oct 2012 22:09:08 +0200
schrieb "monarch_dodra" :

> Given that "byLine" already exists, I'm not sure we can change it 
> now. But I wouldn't be against adding a "byLineSlow" or something.
> 
> However, if we could start again, I'd *definitely* favor a deep 
> copying "byLine" by default, and have a faster, but harder to use 
> "byLineFast".

I agree. And simple demo programs can just use

byLine => string

and if we talk about a fast "word count" demo, then it
probably doesn't hurt when the reader sees that the library
provides ranges for both use cases.

byLineOverwrite => char[]

After all a line is expected to be a string, and D to be safe.
But the real issue are the differing views on how .front
should work. Unlike other problems, this one has solutions
that won't break code, if that is a requirement. So I'll let
the Phobos crew argue and see what happens. :)
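
To illustrate the current behaviour being discussed (a sketch, not a proposal): byLine yields a char[] that may be overwritten on the next iteration, so anything kept past the loop body has to be copied. The file name and the helper `readLines` are hypothetical.

```d
import std.file : remove, write;
import std.stdio : File;

/// Reads all lines, copying each one because byLine may reuse its buffer.
string[] readLines(string path)
{
    string[] kept;
    foreach (line; File(path).byLine)
        kept ~= line.idup;   // immutable copy; `line` itself is transient
    return kept;
}

void main()
{
    enum path = "byline_demo.txt"; // hypothetical scratch file
    write(path, "first\nsecond\nthird\n");
    scope (exit) remove(path);

    assert(readLines(path) == ["first", "second", "third"]);
}
```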

-- 
Marco



Re: More range woes: composed ranges are unsafe to return from functions

2012-10-18 Thread Marco Leise
Am Tue, 16 Oct 2012 17:28:47 -0700
schrieb "H. S. Teoh" :

> On Tue, Oct 16, 2012 at 09:47:36PM +0200, jerro wrote:
> > >Hmm. There *is* a delegate being passed to map(). Would that cause
> > >problems? Theoretically it shouldn't, but as you said, if dmd
> > >isn't
> > >handling it correctly that could cause problems.
> > 
> > I'm looking at the disassembly of cprod
> > (http://pastebin.com/ngTax6B8) and there doesn't seem to be a call
> > to _d_allocmemory in it. AFAIK it should be if the memory for the
> > variables that the delegate uses was allocated on the heap?
> 
> Filed bug:
> 
>   http://d.puremagic.com/issues/show_bug.cgi?id=8832
> 
> Whew, what a day! Two compiler bugs, no less, and a whole bunch of
> Phobos issues. I think I may need to take a break from D for a day or
> two. :-/
> 
> 
> T
> 

And that's where all the good projects end... :D

-- 
Marco



Re: Const ref and rvalues again...

2012-10-18 Thread Timon Gehr

On 10/19/2012 03:26 AM, Malte Skarupke wrote:

On Friday, 19 October 2012 at 00:03:49 UTC, Timon Gehr wrote:


Const is different in D and in C++. Relating const and rvalues is
arbitrary and does not make a lot of sense.

Regarding 'in ref'/'scope ref': What should 'scope' apply to in

void foo(scope ref int* x);


Not sure what you mean with "relating." I'm not making any claims about
there being a relationship between rvalues and constness.



You do it again right away:


This is about finding a way that you can define a function which safely
accepts lvalues and rvalues without having to make a copy. If we specify
the argument as "ref in", then we can safely pass for example the number
                     ^~~~
                     const
5 to it.
^
rvalue


And this would never break existing code, so that something
like swap(5, 4) would never be possible code.



That does not break existing code except code that checks validity of
code, but every language change does that.


For the example that you gave you'd be unable to store the address of x.
So doing

int** storage;
void foo(scope ref int * x)
{
    storage = &x;
}

would be illegal.



Then how to specify that the value of x cannot be escaped?
I'm in favour of doing it the other way round and disallow escaping of
ref parameters without an unsafe cast.


@jerro: the same thing: I'm not trying to fix the problem that you
mention. I'm trying to define a function which can safely accept rvalues
and lvalues without having to make a copy.




Re: Regarding hex strings

2012-10-18 Thread Marco Leise
Am Thu, 18 Oct 2012 16:31:57 +0200
schrieb "monarch_dodra" :

> On Thursday, 18 October 2012 at 13:15:55 UTC, bearophile wrote:
> > monarch_dodra:
> >
> >> hex! was a very good idea actually, imo.
> >
> > It must scale up to "real world" usages. Try it with a program
> > composed of 3 modules each one containing a 100 KB long string.
> > Then try it with a program with two hundred of medium sized
> > literals, and let's see compilation times and binary sizes.
> >
> > Bye,
> > bearophile
> 
> Hum... The compilation is pretty fast actually, about 1 second, 
> provided it doesn't choke.
> 
> It works for strings up to a length of 400 lines @ 80 chars per 
> line, which result to approximately 16K of data. After that, I 
> get a DMD out of memory error.
> 
> DMD memory usage spikes quite quickly. To compile those 400 lines 
> (16K), I use 800MB of memory (!). If I reach about 1GB, then it 
> crashes.
> 
> I tried using a refAppender instead of ret~, but that changed 
> nothing.
> 
> Kind of weird it would use that much memory though...
> 
> Also, the memory doesn't get released. I can parse a 1x400 Line 
> string, but if I try to parse 3 of them, DMD will choke on the 
> second one. :(

Hehe, I assume most of the regulars know this: DMD used to
use a garbage collector that is disabled. Memory just isn't
freed! Also it has copy on write semantics during CTFE:

int bug6498(int x)
{
    int n = 0;
    while (n < x)
        ++n;
    return n;
}
static assert(bug6498(10_000_000) == 10_000_000);

--> Fails with an 'out of memory' error.

http://d.puremagic.com/issues/show_bug.cgi?id=6498

So, as strange as it sounds, for now try not to write often or
into large blocks. Using this knowledge I was sometimes able
to bring down the memory consumption considerably by caching
recurring concatenations of two strings or to!string calls.

That said, appending single elements to an array may actually
be better than using a fixed-size one and having DMD duplicate
it on every write. :p

Please remember to give Don a cookie when he manages to change
the compiler to modify in-place where appropriate.
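
For illustration, here is a minimal sketch of a hex!-style template written
in the append-only style described above. The names hexValue, decodeHex and
hex are hypothetical and are not the implementation discussed in this
thread; whether appending really uses less memory than writing into a
preallocated buffer depends on the compiler version and has not been
measured here.

```d
// Hypothetical helper: value of one hex digit, or -1 for any other character.
int hexValue(char c) pure nothrow @safe
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical decoder: skips non-hex characters such as whitespace and
// appends one byte per pair of digits instead of writing into a
// preallocated buffer.
ubyte[] decodeHex(string s) pure nothrow @safe
{
    ubyte[] result;
    int high = -1; // pending high nibble, -1 when there is none
    foreach (char c; s)
    {
        immutable v = hexValue(c);
        if (v < 0)
            continue;
        if (high < 0)
            high = v;
        else
        {
            result ~= cast(ubyte)(high * 16 + v);
            high = -1;
        }
    }
    assert(high < 0, "odd number of hex digits");
    return result;
}

// Eponymous template: initializing an enum forces decodeHex to run in CTFE.
template hex(string s)
{
    enum ubyte[] hex = decodeHex(s);
}

enum ubyte[] expected = [0xDE, 0xAD, 0xBE, 0xEF];
static assert(hex!"DE AD be ef" == expected);

void main()
{
    // The same function also works at run time.
    assert(decodeHex("00ff").length == 2);
    assert(decodeHex("00ff")[1] == 0xFF);
}
```

The static assert is evaluated entirely by the compiler, so this is the code
path whose memory use is being discussed.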

-- 
Marco



Re: Const ref and rvalues again...

2012-10-18 Thread jerro
@jerro: the same thing: I'm not trying to fix the problem that 
you mention. I'm trying to define a function which can safely 
accept rvalues and lvalues without having to make a copy.


My point was that saving an address of a const ref parameter is 
already unsafe if you call the function with a local variable as 
the parameter. If this behavior seems problematic to you when it 
concerns rvalues, it should seem equally problematic when it 
comes to local variables. It doesn't make sense to make passing 
rvalues as const ref parameters illegal because of this problem, 
when passing local variables causes the same problem and is 
legal. It would only make sense to introduce "in ref" with an 
intent to solve this problem, if local variables would also have 
to be passed as "in ref" (or some kind of scope ref, not 
necessarily const). But that would break pretty much all code 
that uses ref.


The only case I can think of when passing a local variable as 
const ref is safe, but passing an rvalue wouldn't be, is when the 
called function returns the address of the const parameter (or 
assigns it to some other ref parameter).
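
A small sketch of the local-variable case, assuming plain @system code
without any extra escape checking enabled. The names saved, remember and
caller are made up for the example.

```d
// Hypothetical global that outlives any single call.
const(int)* saved;

// Takes its argument by const ref and escapes its address.
// This compiles in @system code; marking the function @safe
// makes the compiler reject the address-of.
void remember(ref const int x)
{
    saved = &x;
}

void caller()
{
    int local = 42;
    remember(local); // passing an lvalue: legal today
} // local's stack slot is gone here, so saved now dangles

void main()
{
    caller();
    // saved still holds an address, but dereferencing it
    // would be undefined behaviour.
    assert(saved !is null);
}
```

No rvalue is involved anywhere, yet the stored pointer is already invalid
once caller() returns.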


Re: What about std.lockfree ?

2012-10-18 Thread denizzzka

"falls"? I mean fails, of course :-) sorry for my English


Re: Regarding hex strings

2012-10-18 Thread Jonathan M Davis
On Friday, October 19, 2012 05:14:44 Marco Leise wrote:
> Hehe, I assume most of the regulars know this: DMD used to
> use a garbage collector that is disabled.

Yes, but it didn't use it for long, because it made performance worse, and 
Walter didn't have the time to spend fixing it, so it was disabled. Presumably, 
someone will take the time to improve it at some point and then it will be 
re-enabled.

> Memory just isn't freed!

That was my understanding, but the last time that I said that, Brad Roberts 
said that it wasn't true, and that we should stop spreading that FUD, so I 
don't know what the exact situation is, but it sounds like if that was true in 
the past, it's not true now. Regardless, it's clear that dmd still uses too 
much memory in many cases, especially when code uses a lot of templates or 
CTFE.

- Jonathan M Davis


Re: More range woes: composed ranges are unsafe to return from functions

2012-10-18 Thread H. S. Teoh
On Fri, Oct 19, 2012 at 04:24:53AM +0200, Marco Leise wrote:
> Am Tue, 16 Oct 2012 17:28:47 -0700
> schrieb "H. S. Teoh" :
> 
> > On Tue, Oct 16, 2012 at 09:47:36PM +0200, jerro wrote:
> > > >Hmm. There *is* a delegate being passed to map(). Would that
> > > >cause problems? Theoretically it shouldn't, but as you said, if
> > > >dmd isn't handling it correctly that could cause problems.
> > > 
> > > I'm looking at the disassembly of cprod
> > > (http://pastebin.com/ngTax6B8) and there doesn't seem to be a call
> > > to _d_allocmemory in it. AFAIK it should be if the memory for the
> > > variables that the delegate uses was allocated on the heap?
> > 
> > Filed bug:
> > 
> > http://d.puremagic.com/issues/show_bug.cgi?id=8832
> > 
> > Whew, what a day! Two compiler bugs, no less, and a whole bunch of
> > Phobos issues. I think I may need to take a break from D for a day
> > or two. :-/
[...]
> And that's where all the good projects end... :D
[...]

Actually, I just went back to working on my personal D project for a
bit. I was a bit disappointed that what I thought would be a quick
side-job (implement cartesianProduct in std.algorithm) turned out to get
stymied by compiler bugs and Phobos issues.

I have to say, though, that in spite of all these problems with the
current implementation of D, it is still pretty dang powerful, and I
would still never go back to C++ again (for my personal projects,
anyway). I have managed to implement in ~2 weeks the vector computation
part of my geometric computation project (that took a whole lot longer
to write in C++ many years ago), and with much cleaner code too. The D
implementation has already far exceeded the original implementation,
both in terms of functionality, and in terms of code cleanliness.

It's just all the little things that D did right: delegates that
simplified the list-processing operator implementation greatly;
templates that allowed me to use the same code to both parse and build
an expression tree or evaluate it on-the-fly (just by passing an
appropriately-crafted subclass to the template); ranges that allow
generic code instead of writing 20 variants of what is essentially the
same code, one for each incompatible container type, etc..
Functional-style code like non-trivial combinations of map and reduce,
which are great for simplifying complex code to just a couple o' lines
-- which are very painful to write in C++ and even harder to debug.

So yes, D still has a ways to go, and it does have its warts, but it's
heaven compared to C++.


T

-- 
Let's not fight disease by killing the patient. -- Sean 'Shaleh' Perry


Re: Regarding hex strings

2012-10-18 Thread Marco Leise
Am Thu, 18 Oct 2012 21:03:01 -0700
schrieb Jonathan M Davis :

> On Friday, October 19, 2012 05:14:44 Marco Leise wrote:
> > Memory just isn't freed!
> 
> That was my understanding, but the last time that I said that, Brad Roberts 
> said that it wasn't true, and that we should stop spreading that FUD, so I 
> don't know what the exact situation is, but it sounds like if that was true 
> in the past, it's not true now. Regardless, it's clear that dmd still uses 
> too much memory in many cases, especially when code uses a lot of templates 
> or CTFE.
> 
> - Jonathan M Davis

He called it a FUD? Without trying to sound too patronizing, most D
programmers would really only notice DMD's memory footprint
when they use CTFE features. It is always Pegged, ctRegex, etc.
that make the issue come up, never basic code. And preloading
the Boehm collector showed that gigabytes of CTFE memory usage
can still be brought down to a few hundred MB [citation
needed]. I guess we can meet somewhere in the middle. Btw. did
I mix up Don and Brad in the last post ? Who is working on the
memory management ?

-- 
Marco



Re: Regarding hex strings

2012-10-18 Thread Jonathan M Davis
On Friday, October 19, 2012 07:29:46 Marco Leise wrote:
> Am Thu, 18 Oct 2012 21:03:01 -0700
> 
> schrieb Jonathan M Davis :
> > On Friday, October 19, 2012 05:14:44 Marco Leise wrote:
> > > Memory just isn't freed!
> > 
> > That was my understanding, but the last time that I said that, Brad
> > Roberts said that it wasn't true, and that we should stop spreading that
> > FUD, so I don't know what the exact situation is, but it sounds like if
> > that was true in the past, it's not true now. Regardless, it's clear that
> > dmd still uses too much memory in many cases, especially when code uses a
> > lot of templates or CTFE.
> > 
> > - Jonathan M Davis
> 
> He called it a FUD?

I don't think that he used quite that term, but his point was that I shouldn't 
be saying that, because it wasn't true, and so I was spreading incorrect 
information (that and the fact that he was tired of people spreading that 
incorrect information, IIRC). I can't find the exact post at the moment though.

> I guess we can meet somewhere in the middle. Btw. did
> I mix up Don and Brad in the last post ? Who is working on the
> memory management ?

I don't think that you mixed anyone up. Don works primarily on CTFE. Brad 
works primarily on the auto tester and other infrastructure required for the 
dmd/Phobos folks to do what they do.

- Jonathan M Davis


Re: More range woes: composed ranges are unsafe to return from functions

2012-10-18 Thread Andrei Alexandrescu

On 10/19/12 8:17 AM, H. S. Teoh wrote:

On Fri, Oct 19, 2012 at 04:24:53AM +0200, Marco Leise wrote:

Am Tue, 16 Oct 2012 17:28:47 -0700
schrieb "H. S. Teoh":


On Tue, Oct 16, 2012 at 09:47:36PM +0200, jerro wrote:

Hmm. There *is* a delegate being passed to map(). Would that
cause problems? Theoretically it shouldn't, but as you said, if
dmd isn't handling it correctly that could cause problems.


I'm looking at the disassembly of cprod
(http://pastebin.com/ngTax6B8) and there doesn't seem to be a call
to _d_allocmemory in it. AFAIK it should be if the memory for the
variables that the delegate uses was allocated on the heap?


Filed bug:

http://d.puremagic.com/issues/show_bug.cgi?id=8832

Whew, what a day! Two compiler bugs, no less, and a whole bunch of
Phobos issues. I think I may need to take a break from D for a day
or two. :-/

[...]

And that's where all the good projects end... :D

[...]

Actually, I just went back to working on my personal D project for a
bit. I was a bit disappointed that what I thought would be a quick
side-job (implement cartesianProduct in std.algorithm) turned out to get
stymied by compiler bugs and Phobos issues.


Admittedly cartesianProduct is a nontrivial juxtaposition of quite a few 
other artifacts. The question here is whether this is just endless churn 
or real progress. I'm optimistic, but am curious about others' opinion.


[snip]

So yes, D still has a ways to go, and it does have its warts, but it's
heaven compared to C++.


One question is how it compares against other languages that foster 
similar bulk processing, such as C# or Scala.



Andrei


Re: Calling conventions

2012-10-18 Thread Jacob Carlborg

On 2012-10-18 21:56, David Nadlinger wrote:


There is no single »C calling convention«, it varies between different
OSes and architectures. But yes, as far as I know the calling convention
used by GDC for extern(D) is the same as GCC (the C compiler) defaults to.


Hence the "or whatever the calling convention used by the system". I 
know that VC on Windows uses a different calling convention (__stdcall), 
don't know what mingw uses.
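
To make the distinction concrete, here is a minimal sketch showing that the
declared linkage is visible to the compiler and is part of the function
type. addC and addD are hypothetical names; the sketch says nothing about
which registers or stack layout a given compiler actually picks for
extern(D).

```d
// Hypothetical functions, identical except for their declared linkage.
extern (C) int addC(int a, int b) { return a + b; }
extern (D) int addD(int a, int b) { return a + b; }

// The compiler records the linkage of each symbol.
static assert(__traits(getLinkage, addC) == "C");
static assert(__traits(getLinkage, addD) == "D");

// Linkage is part of the function pointer type, so the two
// pointer types are not interchangeable.
static assert(!is(typeof(&addC) == typeof(&addD)));

void main()
{
    // At the source level both behave the same.
    assert(addC(2, 3) == addD(2, 3));
}
```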



All those differences should definitely be fixed, at least for x86_64
and future platforms like ARM, because a unified ABI is definitely a
very good thing to have – C++ continues to suffer dearly for not
specifying one. But this is only going to happen if all three compiler
implementations are actively working together.


I think it would be great if those working with the compilers, LDC, GDC 
and Walter (DMD) would discuss this and try to fix it. Currently Walter 
only seems to care about the targets supported by DMD and it's up to the 
other compiler developers to invent their own ABI for the other 
platforms. What happens then when a new platform is added to DMD? Will 
Walter follow the other compilers (are they even consistent?) or roll his 
own ABI?


--
/Jacob Carlborg


Re: Shared keyword and the GC?

2012-10-18 Thread Alex Rønne Petersen

On 17-10-2012 16:26, deadalnix wrote:

Why not definitively adopt the following (and already proposed) memory
scheme (some practices that are now considered valid do not respect this
scheme):

Thread-local heap (one per thread) -> shared heap -> immutable heap

This model has multiple benefits:
  - A TL heap can be processed by interacting with only one thread.
  - The immutable heap can be collected 100% concurrently if we allow some
floating garbage.
  - The shared heap is the only problem, but as its size stays small, the
problem stays small.


Can you elaborate? I'm not sure I understand.

--
Alex Rønne Petersen
a...@lycus.org
http://lycus.org


Re: Shared keyword and the GC?

2012-10-18 Thread Alex Rønne Petersen

On 17-10-2012 13:51, Jacob Carlborg wrote:

On 2012-10-17 10:55, Alex Rønne Petersen wrote:


Let's step back for a bit and think about what we want to achieve with
thread-local garbage collection. The idea is that we look only at a
single thread's heap (and stack/registers, of course) when doing a
collection. This means that we can -- theoretically -- stop only one
thread at a time and only when it needs to be stopped. This is clearly a
huge win in scalability and raw speed. With a scheme like this, it might
even be possible to get away with a simple mark-sweep or copying GC per
thread instead of a complicated generational GC, mainly due to the
paradigms the isolation model induces.

Rust, as it is today, can do this. Tasks (or threads if you will -
though they aren't the same thing) are completely isolated. Types that
can potentially contain pointers into a task's heap cannot be sent to
other tasks at all. Rust also does not have global variables.

So, let's look at D:

1. We have global variables.
2. Only std.concurrency enforces isolation at a type system level; it's
not built into the language, so the GC cannot make assumptions.
3. The shared qualifier effectively allows pointers from one thread's
heap into another's.

It's important to keep in mind that in order for thread-local GC (as
defined above) to be possible at all, *under no circumstances whatsoever
must there be a pointer in one thread's heap into another thread's heap,
ever*. If this happens and you apply the above GC strategy (stop one
thread at a time and scan only that thread's heap), you're effectively
dealing with something very similar to the lost object problem on
concurrent GC.

To clarify with regards to the shared qualifier: It does absolutely
nothing. It's useless. All it does is slap a pretty "I can be shared
arbitrarily across threads" label on a type. Even if you have this
knowledge in the GC, it's not going to help you, because you *still*
have to deal with the problem that arbitrary pointers can be floating
around in arbitrary threads.
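
As a rough sketch of that problem: the snippet below lets an allocation made
by one thread become reachable from another through a global. leaked and
runWorker are hypothetical names, and __gshared is used only to keep the
example short; a shared global would create the same cross-thread reference.

```d
import core.thread : Thread;

// Hypothetical global that every thread can see.
__gshared int* leaked;

void runWorker()
{
    auto t = new Thread({
        auto p = new int; // allocated by the worker thread
        *p = 42;
        leaked = p;       // now reachable from every other thread
    });
    t.start();
    t.join();
}

void main()
{
    runWorker();
    // The main thread reads memory the worker allocated. A collector that
    // scanned only the worker's roots could not know this reference exists.
    assert(*leaked == 42);
}
```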

(And don't even get me started on the lack of clear semantics (and even
the few semi-agreed-upon but flawed semantics) for shared.)


All TLS data is handled by collectors running in their one single
thread, as you describe above. Any non-TLS data is handled the same way
as the GC currently works.

This is how the, now deprecated, Apple GC used by Objective-C works.



How does it deal with the problem where a pointer in TLS points to 
global data, or worse yet, a pointer in the global heap points to TLS?


I'm pretty sure it can't without doing a full pass over the entire heap, 
which seems to me like it defeats the purpose.


But I may just be missing out on some restriction (type system or 
whatever) Objective-C has that makes it feasible.


--
Alex Rønne Petersen
a...@lycus.org
http://lycus.org