Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Daniel Keep


Vladimir Panteleev wrote:
> On Mon, 01 Jun 2009 02:21:33 +0300, Andrei Alexandrescu 
>  wrote:
> 
>> To argue that convincingly, you'd need to disable conversions from  
>> arrays of class objects to void[].
> 
> You're right. Perhaps implicit cast of reference types to void[] should 
> result in an error.

If only there were a way to indicate that void[]s could contain
pointers, then they would behave uniformly across types...

Oh wait.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 05:28:39 +0300, Christopher Wright  
wrote:

> Vladimir Panteleev wrote:
>> std.boxer is actually a valid counter-example for my post.
>> The specific fix is simple: replace the void[] with void*[].
>> The generic "fix" is just to add a line to  
>> http://www.digitalmars.com/d/garbage.html adding that hiding your only  
>> reference in a void[] results in undefined behavior. I don't think this  
>> should be an inconvenience to any projects?
>
> What do you use for "may contain unaligned pointers"?

Sorry, what do you mean? I don't understand why such a type is needed? 
Implementing support for scanning memory ranges for unaligned pointers will 
slow down the GC even more.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 02:21:33 +0300, Andrei Alexandrescu 
 wrote:

> To argue that convincingly, you'd need to disable conversions from  
> arrays of class objects to void[].

You're right. Perhaps implicit cast of reference types to void[] should result 
in an error.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

Vladimir Panteleev wrote:

That's a subjective opinion :) I could just as well continue arguing
that void[] is the perfect type for any kind of "opaque" binary data
due to its properties.


To argue that convincingly, you'd need to disable conversions from 
arrays of class objects to void[].


Andrei


Re: visualization of language benchmarks

2009-05-31 Thread Nick Sabalausky
"Denis Koroskin" <2kor...@gmail.com> wrote in message 
news:op.uuthxivwo7c...@soldat.creatstudio.intranet...
> On Mon, 01 Jun 2009 03:21:42 +0400, Tim Matthews  
> wrote:
>
>> Knud Soerensen wrote:
>>> Tim Matthews wrote:
>
>>>>> It's things like this that make me want to get into visualization.
>>>>> Great article!
>>>>
>>>> Where's the D
>>>  It is on 3,3 called Dlang.
>>>
>>
>> OK, it was on the 05 chart but I was expecting it to be on the updated
>> 09 chart though. They seem to believe D is less of a player now.
>
> IIRC, there was no stable 64bit D compiler for Linux at the moment they 
> moved to new hardware and thus D support was dropped.

So their benchmarks are only accurate for 64-bit?




Re: Source control for all dmd source (Git propaganda =)

2009-05-31 Thread Leandro Lucarella
Daniel Keep, on June 1 at 13:22 you wrote to me:
> It's beautiful and I'm now looking to convert my existing projects in
> svn over to Git because it's just so much better.

Git's usability features are just unique; I've never seen another SCM as
usable as Git. Even though Mercurial is nice, I miss Git so much when
I use it that it hurts! =)

A trivial example of this is that all git commands with long output (like diff
or log) run through a pager (*if* their output doesn't fit the current
console). This feature makes a *huge* difference in day-to-day work.
Colored output is great too, and the local/remote branch management is
really great. Not to mention git rebase... What a command!

-- 
Leandro Lucarella (luca) | Group blog: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

We are signing off until next week, reflecting on our
lives: "What a sh... life... What a sh... life!"
-- Sidharta Kiwi


Re: Source control for all dmd source

2009-05-31 Thread Daniel Keep


Jérôme M. Berger wrote:
> Leandro Lucarella wrote:
>> Well, that's great to hear! As Robert said, please tell if you need any
>> help. Please, please, please consider using a distributed SCM (I think
>> git
>> would be ideal but mercurial is good too). That would make merging and
>> branching a lot easier.
>>
> Git is a bad choice because of its poor Windows support. Mercurial
> or Bazaar are much better in this regard.
> 
> Jerome

Lies.

I've been using it daily for about a month and a half now.  I've got
both TortoiseGIT and msysGit installed.  Both were installed with
regular Windows installers that required me to click "Next" a few times
and maybe a "Finish" or two.

You DO NOT need to run Git from inside a weird shell or anything; git
commands work directly from the standard sucky Windows command-line.
And it appears to be quite fast, too.  Hell, it seems to be even faster
than svn.

It's beautiful and I'm now looking to convert my existing projects in
svn over to Git because it's just so much better.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Christopher Wright

Lionello Lunesu wrote:

Denis Koroskin wrote:
On Sun, 31 May 2009 22:45:23 +0400, Vladimir Panteleev 
 wrote:
 
I just went through a ~15000-line project and replaced most 
occurrences  of void[]. Now the project is an ugly mess of void[], 
ubyte[] and casts,  but at least it doesn't leak memory like crazy 
any more.
 
I don't know why it was decided to mark the contents of void[] as 
"might  have pointers". It makes no sense!
 
 
FWIW, I also consider void[] as a storage for an arbitrary untyped binary
data, and thus I believe GC shouldn't scan it.

You're contradicting yourself there. void[] is arbitrary untyped data, 
so it could contain uints, floats, bytes, pointers, arrays, strings, 
etc. or structs with any of those.


I think the current behavior is correct: ubyte[] is the new void*.


Even in C, people often use unsigned char* for arbitrary data that does 
not include pointers.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Christopher Wright

Vladimir Panteleev wrote:

std.boxer is actually a valid counter-example for my post.
The specific fix is simple: replace the void[] with void*[].
The generic "fix" is just to add a line to 
http://www.digitalmars.com/d/garbage.html adding that hiding your only reference in a 
void[] results in undefined behavior. I don't think this should be an inconvenience to 
any projects?


What do you use for "may contain unaligned pointers"?


Re: Source control for all dmd source

2009-05-31 Thread Christopher Wright

Leandro Lucarella wrote:

"Jérôme M. Berger", on May 31 at 19:03 you wrote to me:

Leandro Lucarella wrote:

Well, that's great to hear! As Robert said, please tell if you need any
help. Please, please, please consider using a distributed SCM (I think git
would be ideal but mercurial is good too). That would make merging and
branching a lot easier.

Git is a bad choice because of its poor Windows support. Mercurial or 
Bazaar are much better in this regard.


That's a myth; Git is pretty well supported on Windows now. I know people
who use it on a regular basis. See:
http://code.google.com/p/msysgit/


I've used tortoise-git: http://code.google.com/p/tortoisegit/

It worked pretty well, though I didn't spend much time with it.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Lionello Lunesu

Denis Koroskin wrote:

On Sun, 31 May 2009 22:45:23 +0400, Vladimir Panteleev 
 wrote:
 
I just went through a ~15000-line project and replaced most occurrences  
of void[]. Now the project is an ugly mess of void[], ubyte[] and casts,  
but at least it doesn't leak memory like crazy any more.
 
I don't know why it was decided to mark the contents of void[] as "might  
have pointers". It makes no sense!
 
 
FWIW, I also consider void[] as a storage for an arbitrary untyped binary
data, and thus I believe GC shouldn't scan it.

You're contradicting yourself there. void[] is arbitrary untyped data, 
so it could contain uints, floats, bytes, pointers, arrays, strings, 
etc. or structs with any of those.


I think the current behavior is correct: ubyte[] is the new void*.

I also agree that std.file.read (and similar functions) should return 
ubyte[] instead of void[], to prevent surprises after concatenation.


L.


Re: Automatic void initialization

2009-05-31 Thread BCS

Hello Fractal,


Hello

if I have the following code:

int foo;
foo = 5;
When the variable foo is declared, is it initialized to int.init, or
does it have garbage contents until it is assigned?

Thanks



int.init unless the compiler rewrites it as

int foo = 5;

and then 5.




Re: visualization of language benchmarks

2009-05-31 Thread BCS

Hello Denis,


On Mon, 01 Jun 2009 03:04:14 +0400, BCS  wrote:


Hello Denis,


I wonder where ASM would be located :p


top left.


I highly doubt hand-written assembly for those tasks will be anywhere
close to optimal.

I bet it would be in top right corner.



There are only two cases where ASM should be used; 1) where you need access 
to specific op codes that the language doesn't expose and 2) where it needs 
to be faster than what you can otherwise get in any available language. Based 
on that, you will never see it anywhere BUT the left edge. For that matter, 
if you aren't on the left edge, take whatever is and disassemble it and now 
you are.





Re: Automatic void initialization

2009-05-31 Thread Jason House
Fractal wrote:

> Hello
> 
> if I have the following code:
> 
> int foo;
> foo = 5;
> 
> When the variable foo is declared, is it initialized to int.init, or does
> it have garbage contents until it is assigned?
> 
> Thanks

D will always initialize variables unless you explicitly tell it not to. 
(although a smart compiler may optimize certain cases)

Here's how to get foo initialized to garbage:
int foo = void;
foo = 5;
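
The two cases discussed above can be put side by side. This is only a minimal sketch in D2; the function names defaultInitialized and voidThenAssigned are made up for illustration and do not come from the thread.

```d
import std.stdio : writeln;

// Default initialization: an int starts out as int.init unless told otherwise.
int defaultInitialized()
{
    int foo;          // implicitly initialized to int.init
    return foo;
}

// Void initialization: the implicit int.init store is skipped, so foo
// holds an indeterminate value until the assignment below.
int voidThenAssigned()
{
    int foo = void;
    foo = 5;          // must be assigned before any read
    return foo;
}

void main()
{
    writeln(defaultInitialized() == int.init); // prints true
    writeln(voidThenAssigned());               // prints 5
}
```

Reading foo between "= void" and the first assignment would yield garbage, which is why the explicit form is normally reserved for cases where the initializing store is a measured cost.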



Automatic void initialization

2009-05-31 Thread Fractal
Hello

if I have the following code:

int foo;
foo = 5;

When the variable foo is declared, is it initialized to int.init, or does it 
have garbage contents until it is assigned?

Thanks


Re: Source control for all dmd source

2009-05-31 Thread Leandro Lucarella
"Jérôme M. Berger", on May 31 at 22:49 you wrote to me:
> hasen wrote:
> >Leandro Lucarella wrote:
> >>"Jérôme M. Berger", on May 31 at 19:03 you wrote to me:
> >>>Leandro Lucarella wrote:
> >>>>Well, that's great to hear! As Robert said, please tell if you need any
> >>>>help. Please, please, please consider using a distributed SCM (I think git
> >>>>would be ideal but mercurial is good too). That would make merging and
> >>>>branching a lot easier.
> >>>Git is a bad choice because of its poor Windows support. Mercurial or 
> >>> Bazaar are much better in this regard.
> >>
> >>That's a myth; Git is pretty well supported on Windows now. I know people
> >>who use it on a regular basis. See:
> >>http://code.google.com/p/msysgit/
> >>
> >Yea, I use git on windows (msysgit), works like a charm.
> >+1 for git
>   You just have to look at the name to see that it doesn't really
>   work on Windows: it requires a Unix emulation layer (msys in your
>   case, others do it with cygwin) which a lot of people won't want
>   to install/use.

That project provides an easy installer that has all you need to run git
(in 10MB; I wonder how much software is available for Windows with an
installer of only 10MB).

Why do you care if it installs some "Unix emulation layer" (which is just
some tools common in Unix OSes; it's not as if Git runs in a VM or something)?
You don't even need to install it if you don't want to; you can use the
"portable app" package =)

It does work on Windows.

-- 
Leandro Lucarella (luca) | Group blog: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

This is what you get,
when you mess with us.


Unofficial wish list status.(Jun 2009)

2009-05-31 Thread 4tuu4k002

Hi

This is the monthly status for the unofficial d wish list: 
http://all-technology.com/eigenpolls/dwishlist/

Right now the wish list looks like this:

202  Stack tracing (#26)
197  Reflection API (#6)
132  vectorization (#10)
113  Multiple return values (tuples (#28)
100  Multiple opCast per class (#24)
91  Debug check for null reference (#52)
86  Native AMD64 codegen (#36)
78  Short syntax for new (#18)
75  !in (#44)
74  unit test after compilation (#1)
70  extra compiler values (#19)
64  Return-type overloading (#49)
58  Explicit out/inout (#38)
54  Foreach on first/on last (#42)
53  Unit test isolation  (#2)
50  Posix threads support native (#3)
46  Array pushback/popback (#51)
46  Variadic arguments re-passing (#102)
45  Array masking (#11)
45  Consistent struct/class sizeof (#40)
44  better syntax for cast (#23)
42  Explicit type initializers (#35)
42  L-Value return (#73)
37  Named keyword arguments (#87)
36  black box unit testing (#8)
36  struct constructor (#97)
35  associative arrays by index (#16)
35  Non-Static isExpression (#37)
34  unit test & code separation (#7)
34  Explicit module `friendship` (#43)
33  coherent assoc. array syntax (#20)
33  Pass value params byref (#34)
33  auto-member objects (#45)
31  Conditional syncronized (#30)
29  Unit test measurements (#9)
29  Explicit property keyword (#83)
27  Inline enum declaration (#76)
26  Renaming ctor/dtor (#17)
24  User-defined sync function (#31)
24  Small Exectables (#88)
23  interface to C++ (#71)
22  proper cast operators (#21)
22  Iterators and Generators (#58)
22  Pascal like sets (#61)
22  if, while, true, false, int (#86)
21  Built-in variant type (#56)
20  Precise names for floats (#62)
18  range type (#106)
17  D library contest (#59)
17  No Postfix Array Declarations (#85)
17  Full lexical closures (#140)
17  Real C bitfields (#145)
15  garbage collection switch  (#96)
15  Multi-Dimensional Allocation (#109)
14  conv() and opConv (#66)
14  Finite sets (#72)
14  opCast overloading (#81)
14  modules must not rely on files (#84)
14  copy operator (#95)
13  Call log (#47)
13  Improve module architecture (#64)
13  Meta Information (#69)
11  inout variable and return (#60)
11  imag and comp FP types. (#63)
11  inline expansion (#67)
10  Against class instance sizeof (#48)
10  function inheritance (#92)
10  In flight exception detection (#101)
9  Parallel Scavenging GC (#80)
9  in for arrays (#160)
9  Get rid of const (#165)
8  Relational class/array algebra (#65)
8  Statically check for == null (#98)
8  in for arrays (#161)
8  throws keyword (#173)
7  support struct&array in switch (#99)
7  date/time/datetime literal (#105)
7  void Class.Method() {} syntax (#146)
7  static foreach(scope/unscope) (#152)
6  Declaration in function calls (#74)
6  array in template arguments (#91)
6  Efficient array opCatAssign (#148)
6  Tango to work with D2 (#179)
5  Explicit out/inout/lazy (#110)
5  Better UTF32 Support (#113)
5  First-class continuations (#141)
5  Implicit New (#143)
5  suffix identifiers. (#168)
4  named tuple (#103)
4  tuple literal and append-op (#151)
4  ext property for  basic types (#154)
4  {Cleaner Operator Overloading} (#166)
4  Property declarator (#174)
4  Power operator (#177)
3  System.Windows.Forms (#93)
3  Reallocation Keyword (#108)
3  function call over network (#111)
3  Property shortcut (#144)
3  variable template(short syntax (#149)
3  template literal (#150)
3  Custom Attributes (#159)
3  templated constructors (#164)
3  New Switch Case Design (#170)
3  Remove const (#171)
3  Remove const (#172)
3  Voting in bugzilla for D. (#176)
2  Manage .resources files (#70)
2  Multistep return (#75)
2  constant operater overloading (#100)
2  solve interdepend static this (#107)
2  Quick For Syntax (#142)
2  invariant function (#156)
2  constant member functions (#158)
2  Keyword Pow Operator (#162)
2  Custom Syntax (#163)
2  C++ Member Pointers (#167)
2  Enum string cast (#178)
2  Template inst. syntax: <> (#182)
1  consistant new (#77)
1  temp alias param specialize (#112)
1  remove initializers (#147)
1  __traits (#153)
1  temporary variable (#155)
1  Dynamic Conditional (#157)
1  Better Array Function Template (#169)
1  Remove SFINAE (#175)
1  Auto const member funcs (#180)
1  Overlapping array copy (#181)
1  Template inst. syntax: <> (#183)
1  Template inst. syntax: <> (#184)
1  Invariant => invar (#185)
1  similar templt/function syntax (#186)
1  classes on stack (or ROM) (#188)
1  Easy threading a la OpenMP (#189)
0  -nogc option (#187)
0  link exchange request (#190)
0  link exchange request (#191)


Re: visualization of language benchmarks

2009-05-31 Thread Denis Koroskin
On Mon, 01 Jun 2009 03:21:42 +0400, Tim Matthews  
wrote:

> Knud Soerensen wrote:
>> Tim Matthews wrote:

>>>> It's things like this that make me want to get into visualization.
>>>> Great article!
>>>
>>> Where's the D
>>  It is on 3,3 called Dlang.
>>
>
> OK, it was on the 05 chart but I was expecting it to be on the updated  
> 09 chart though. They seem to believe D is less of a player now.

IIRC, there was no stable 64bit D compiler for Linux at the moment they moved 
to new hardware and thus D support was dropped.


Re: visualization of language benchmarks

2009-05-31 Thread Tim Matthews

Knud Soerensen wrote:

Tim Matthews wrote:


It's things like this that make me want to get into visualization.
Great article!


Where's the D


It is on 3,3 called Dlang.



OK, it was on the 05 chart but I was expecting it to be on the updated 
09 chart though. They seem to believe D is less of a player now.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

Vladimir Panteleev wrote:

On Sun, 31 May 2009 23:24:09 +0300, Andrei Alexandrescu
 wrote:


But I think this is too much ado about nothing - you're avoiding
the type system to start with, so use ubyte, insert a cast, and
call it a day.


I don't get it - not using casts is avoiding the type system? :P Note
that I am NOT up-casting the void[] later back to some other type -
it goes out to the network, a file, etc. void[] sounds like it fits
perfectly in the type hierarchy for "just a bunch of bytes", except
for the "may contain pointers" fine print.


I understand. You are sending around object representation. void[] may 
contain pointers, so you're simply not looking at the right abstraction.



If you have too many casts, the problem is most likely elsewhere so
that argument I'm not buying.


I could cut down on the number of casts if I were to replace most
array appending operations to calls to a function that takes a void[]
and then internally casts to an ubyte[] and appends that somewhere.
There's a lot of diversity of types being worked with in my case -
strings, various structs, more raw data, etc. I'm more annoyed that
I'd need to do something like that to work around a design decision
that may not have been fully thought out.


Walter has written a class called OutBuffer (see std.outbuffer) the 
likes of which could be used to encapsulate representation marshaling.


Andrei



Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

Vladimir Panteleev wrote:

On Mon, 01 Jun 2009 00:00:45 +0300, Andrei Alexandrescu 
 wrote:


const(ubyte)[] getRepresentation(T)(T[] data)
{
 return cast(typeof(return)) data;
}


This is functionally equivalent to (forgive the D1):
ubyte[] getRepresentation(void[] data)
{
return cast(ubyte[]) data;
}
Since no allocation is done in this case, the use of void[] is safe, and it 
doesn't instantiate a version of the function for every type you call it with. 
I remarked about this in my other reply.



This is not safe because you can change the data.

Andrei
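
One reading of the objection above is that the const(ubyte)[] return type is what keeps callers from writing through the byte view. The sketch below contrasts the two functions quoted in this message; getMutableRepresentation is a hypothetical name used here only so both variants can coexist in one file, and the int[] test data is likewise an assumption.

```d
// The template quoted above: a read-only byte view of any array.
const(ubyte)[] getRepresentation(T)(T[] data)
{
    return cast(typeof(return)) data;
}

// The D1-style variant quoted above, renamed (hypothetical name):
// it returns a mutable view of the same memory.
ubyte[] getMutableRepresentation(void[] data)
{
    return cast(ubyte[]) data;
}

void main()
{
    int[] values = [1, 2];

    auto readOnly = getRepresentation(values);
    assert(readOnly.length == values.length * int.sizeof);
    // Writing through the const view is rejected at compile time.
    static assert(!__traits(compiles, readOnly[0] = 0xFF));

    auto writable = getMutableRepresentation(values);
    writable[0] = 0xFF;      // silently rewrites one byte of values[0]
    assert(values[0] != 1);  // the original array has been altered
}
```

With an array of class references instead of ints, the same one-byte write would corrupt a live pointer, which is presumably the kind of damage the const version is meant to rule out.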


Re: visualization of language benchmarks

2009-05-31 Thread Denis Koroskin
On Mon, 01 Jun 2009 03:04:14 +0400, BCS  wrote:

> Hello Denis,
>
>
>> I wonder where ASM would be located :p
>>
>
> top left.
>
>

I highly doubt hand-written assembly for those tasks will be anywhere close to 
optimal.

I bet it would be in top right corner.


Re: visualization of language benchmarks

2009-05-31 Thread BCS

Hello Denis,



I wonder where ASM would be located :p



top left.




Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

Vladimir Panteleev wrote:

On Sun, 31 May 2009 23:24:09 +0300, Andrei Alexandrescu 
 wrote:

But I think this is too much ado about nothing - you're avoiding the type system to start with, so use ubyte, insert a cast, and call it a day. 


I don't get it - not using casts is avoiding the type system? :P Note that I am NOT up-casting the 
void[] later back to some other type - it goes out to the network, a file, etc. void[] sounds like 
it fits perfectly in the type hierarchy for "just a bunch of bytes", except for the 
"may contain pointers" fine print.


If you have too many casts, the problem is most likely elsewhere so that 
argument I'm not buying.


I could cut down on the number of casts if I were to replace most array 
appending operations to calls to a function that takes a void[] and then 
internally casts to an ubyte[] and appends that somewhere. There's a lot of 
diversity of types being worked with in my case - strings, various structs, 
more raw data, etc. I'm more annoyed that I'd need to do something like that to 
work around a design decision that may not have been fully thought out.



Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread bearophile
Vladimir Panteleev:
> Consider this really basic example of file concatenation:
> auto data = read("file1") ~ read("file2"); // oops! void[] concatenation - minefield created

I think a better design for that read() function is to return ubyte[].
I have never understood why it returns a void[].
To manage generic data, ubyte[] is better than void[] in your program (sometimes 
uint[] is useful to increase efficiency compared to ubyte[]).

Bye,
bearophile
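
Until read() returns ubyte[], the concatenation can be made byte-typed by casting each piece first. The following is a rough D2 sketch, not a tested recipe for every runtime: concatFiles and the scratch file names are invented for illustration, and whether the resulting block is actually flagged NO_SCAN depends on the runtime, so the last line only queries the flag rather than promising a value.

```d
import core.memory : GC;
import std.file : read, remove, tempDir, write;
import std.path : buildPath;
import std.stdio : writeln;

// Joins two files as ubyte[] so the newly allocated array is typed as plain bytes.
ubyte[] concatFiles(string first, string second)
{
    // read() returns void[]; cast each piece before ~ so the result is ubyte[].
    return (cast(ubyte[]) read(first)) ~ (cast(ubyte[]) read(second));
}

void main()
{
    // Hypothetical scratch files, created only for this demonstration.
    auto a = buildPath(tempDir(), "concat_demo_1.bin");
    auto b = buildPath(tempDir(), "concat_demo_2.bin");
    write(a, "abc");
    write(b, "def");
    scope (exit) { remove(a); remove(b); }

    ubyte[] data = concatFiles(a, b);
    writeln(data.length); // prints 6

    // Ask the GC whether the freshly allocated block is flagged as pointer-free.
    auto attrs = GC.getAttr(GC.addrOf(data.ptr));
    writeln((attrs & GC.BlkAttr.NO_SCAN) != 0);
}
```

The casts are exactly the noise complained about elsewhere in this thread; wrapping them in one helper at least keeps them out of the calling code.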


Re: visualization of language benchmarks

2009-05-31 Thread Denis Koroskin
On Mon, 01 Jun 2009 02:11:39 +0400, Jarrett Billingsley 
 wrote:

> On Sun, May 31, 2009 at 5:31 PM, Knud Soerensen
> <4tuu4k...@sneakemail.com> wrote:
>> Tim Matthews wrote:

>>>> It's things like this that make me want to get into visualization.
>>>> Great article!
>>>
>>> Where's the D
>>
>> It is on 3,3 called Dlang.
>
> It seems to be pretty close to the "ideal" corner at that ;)

Yeah, noticeably closer than Java.

I wonder where ASM would be located :p


Re: visualization of language benchmarks

2009-05-31 Thread Jarrett Billingsley
On Sun, May 31, 2009 at 5:31 PM, Knud Soerensen
<4tuu4k...@sneakemail.com> wrote:
> Tim Matthews wrote:
>>>
>>> It's things like this that make me want to get into visualization.
>>> Great article!
>>
>> Where's the D
>
> It is on 3,3 called Dlang.

It seems to be pretty close to the "ideal" corner at that ;)


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 00:28:21 +0300, Walter Bright  
wrote:

> Hmm. Wouldn't compression data be naturally a ubyte[] type?

(again, something I forgot to add... shouldn't hit Send so soon)

Consider this really basic example of file concatenation:

auto data = read("file1") ~ read("file2"); // oops! void[] concatenation - minefield created

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 00:28:21 +0300, Walter Bright  
wrote:

> Because one is an obvious failure, and the other will be memory  
> corruption. Memory corruption is pernicious and awful.

I wanted to add that debugging memory corruptions and other memory problems for 
D right now is complicated due to lack of proper tools in this area. Hopefully 
this will change in the near future.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 00:28:21 +0300, Walter Bright  
wrote:

> Vladimir Panteleev wrote:
>> I just realized that by "performance" you might have meant memory
>> leaks.
>
> No, in this context I meant improving performance by not scanning the  
> void[] memory for pointers.

>> Well, sure, if you can say that my programs crashing every few
>> hours due to running out of memory is a "performance" problem. I'm
>> sorry to sound bitter, but this was the cause of much annoyance for
>> my software's users. It took me to write a memory debugger to
>> understand that no matter how much you chase void[]s with
>> hasNoPointers, there will always be that one ~ which you overlooked.
>
> I'm curious what form of data you have that always seem to look like  
> valid pointers. There are a couple other options you can pursue - moving  
> the gc pool to another location in the address space, or changing the  
> alignment of your void[] data so it won't look like aligned pointers  
> (the gc won't look for misaligned pointers).

It's just compressed data, which is evenly distributed across the 32-bit 
address space. Let's do the math:

Suppose we have an application which has two blocks of memory, M and N. Block M 
is a block with random data which is erroneously marked as having pointers, 
while block N is a block which shouldn't have any pointers towards it.
Now, the chance that a random DWORD will point inside N is 
sizeof(N)/0x100000000 - or rather, we can say that it will NOT point inside N 
with the probability of 1-(sizeof(N)/0x100000000). For as many DWORDs as there 
are in M, raise that to the power sizeof(M)/4. For values already as small as 1 
MB for M and N, it's pretty much guaranteed that you'll have pointers inside N. 
Relocating or re-aligning the data won't help - it won't affect the entropy or 
the value range.
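
To put rough numbers on this estimate, here is a worked version under the same assumptions as the paragraph above: uniformly random 32-bit words, 4-byte DWORDs, and a 2^32-byte address space. The notation |M| and |N| for the block sizes in bytes is introduced here only for the calculation.

```latex
\[
P(\text{no DWORD in } M \text{ points into } N)
  = \left(1 - \frac{|N|}{2^{32}}\right)^{|M|/4}
\]
\[
|M| = |N| = 2^{20} \text{ bytes (1 MB)}: \quad
\left(1 - 2^{-12}\right)^{2^{18}}
  \approx e^{-2^{18}/2^{12}} = e^{-64} \approx 1.6 \times 10^{-28}
\]
```

So with 1 MB blocks the chance that N escapes every false pointer is on the order of 10^-28, and the chance that at least one word of M appears to point into N is indistinguishable from 1.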

> Or just use ubyte[] instead.

And the casts that come with it :(

>> As much as I try to look from an objective perspective, I don't see
>> how a memory leak (and memory leaks in D usually mean that NO memory
>> is being freed, except for small lucky objects not having bogus
>> pointers to them) is a problem less significant than an obscure case
>> that involves allocating a void[], storing a pointer in it and losing
>> all other references to the object.
>
> Because one is an obvious failure, and the other will be memory  
> corruption. Memory corruption is pernicious and awful.

It is, yes. But if you add "don't put your only references inside void[]s" to 
the "don'ts" on the GC page, the programmer will only have himself to blame for 
not reading the language documentation. This goes right along with other 
tricks IMHO.

>> In fact, I just searched the D
>> documentation and I couldn't find a statement saying whether void[]
>> are scanned by the GC or not. Enter mr. D-newbie, who wants to write
>> his own network/compression/file-copying/etc. library/program and
>> stumbles upon void[], the seemingly perfect
>> abstract-binary-data-container type for the job... (which is exactly
>> what happened with yours truly).
>>  P.S. Not trying to push my point of view, but just trying to offer
>> some perspective from someone who has been bit by this design
>> choice...
>
> Hmm. Wouldn't compression data be naturally a ubyte[] type?

That's a subjective opinion :) I could just as well continue arguing that 
void[] is the perfect type for any kind of "opaque" binary data due to its 
properties.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread BCS

Hello Walter,


I'm curious what form of data you have that always seem to look like
valid pointers. There are a couple other options you can pursue -
moving the gc pool to another location in the address space, or
changing the alignment of your void[] data so it won't look like
aligned pointers (the gc won't look for misaligned pointers).



Most (but not all) of the cases I can think of where you get false pointers, 
re-aligning stuff or moving the heap won't help as the false pointer source 
will hit the full address space.





Re: static this sucks, we should deprecate it

2009-05-31 Thread BCS

Hello Walter,


That's why, for example, airplanes have things that must be
removed before flight attached to big red flags that hang outside. You
don't really want to find out after you're airborne that your pitot
tubes still have the dust cap on!



Or even better:

http://www.aircraftspruce.com/catalog/graphics/10-02000.jpg

the local airport added these to its rentals after a few aborted takeoffs 
(from bugs in the tubes).





Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 00:00:45 +0300, Andrei Alexandrescu 
 wrote:

> const(ubyte)[] getRepresentation(T)(T[] data)
> {
>  return cast(typeof(return)) data;
> }

This is functionally equivalent to (forgive the D1):
ubyte[] getRepresentation(void[] data)
{
    return cast(ubyte[]) data;
}
Since no allocation is done in this case, the use of void[] is safe, and it 
doesn't instantiate a version of the function for every type you call it with. 
I remarked about this in my other reply.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: visualization of language benchmarks

2009-05-31 Thread Knud Soerensen

Tim Matthews wrote:


It's things like this that make me want to get into visualization.
Great article!


Where's the D


It is on 3,3 called Dlang.

--
Join me on
CrowdNews  http://crowdnews.eu/users/addGuide/42/
Facebook   http://www.facebook.com/profile.php?id=1198821880
Linkedin   http://www.linkedin.com/pub/0/117/a54
Mandala    http://www.mandala.dk/view-profile.php4?profileID=7660


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Walter Bright

Vladimir Panteleev wrote:

I just realized that by "performance" you might have meant memory
leaks.


No, in this context I meant improving performance by not scanning the 
void[] memory for pointers.



Well, sure, if you can say that my programs crashing every few
hours due to running out of memory is a "performance" problem. I'm
sorry to sound bitter, but this was the cause of much annoyance for
my software's users. It took me to write a memory debugger to
understand that no matter how much you chase void[]s with
hasNoPointers, there will always be that one ~ which you overlooked.


I'm curious what form of data you have that always seem to look like 
valid pointers. There are a couple other options you can pursue - moving 
the gc pool to another location in the address space, or changing the 
alignment of your void[] data so it won't look like aligned pointers 
(the gc won't look for misaligned pointers).


Or just use ubyte[] instead.


As much as I try to look from an objective perspective, I don't see
how a memory leak (and memory leaks in D usually mean that NO memory
is being freed, except for small lucky objects not having bogus
pointers to them) is a problem less significant than an obscure case
that involves allocating a void[], storing a pointer in it and losing
all other references to the object.


Because one is an obvious failure, and the other will be memory 
corruption. Memory corruption is pernicious and awful.



In fact, I just searched the D
documentation and I couldn't find a statement saying whether void[]
are scanned by the GC or not. Enter mr. D-newbie, who wants to write
his own network/compression/file-copying/etc. library/program and
stumbles upon void[], the seemingly perfect
abstract-binary-data-container type for the job... (which is exactly
what happened with yours truly).

P.S. Not trying to push my point of view, but just trying to offer
some perspective from someone who has been bit by this design
choice...


Hmm. Wouldn't compression data be naturally a ubyte[] type?



Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Sun, 31 May 2009 23:24:09 +0300, Andrei Alexandrescu 
 wrote:

> But I think this is too much ado about nothing - you're avoiding the type 
> system to start with, so use ubyte, insert a cast, and call it a day. 

I don't get it - not using casts is avoiding the type system? :P Note that I am 
NOT up-casting the void[] later back to some other type - it goes out to the 
network, a file, etc. void[] sounds like it fits perfectly in the type 
hierarchy for "just a bunch of bytes", except for the "may contain pointers" 
fine print.

> If you have too many casts, the problem is most likely elsewhere so that 
> argument I'm not buying.

I could cut down on the number of casts if I were to replace most array 
appending operations with calls to a function that takes a void[] and then 
internally casts to a ubyte[] and appends that somewhere. There's a lot of 
diversity of types being worked with in my case - strings, various structs, 
more raw data, etc. I'm more annoyed that I'd need to do something like that to 
work around a design decision that may not have been fully thought out.
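
A minimal sketch of the kind of helper described above, written in present-day D2 syntax. The names `buffer` and `queue` are hypothetical (borrowed from the naive buffer example elsewhere in this thread), and the `const(void)[]` parameter is an assumption about how one would write it today, not something taken from the original project:

```d
// Hypothetical helper: accept any array without a cast at the call site,
// but store the bytes in a ubyte[] so the GC does not scan them.
ubyte[] buffer;

void queue(const(void)[] data)
{
    // The single cast lives here instead of at every call site.
    buffer ~= cast(const(ubyte)[]) data;
}

void main()
{
    int[] numbers = [1, 2, 3];
    queue(numbers);          // int[] converts implicitly to const(void)[]
    queue("Hello, World!");  // so does a string
    assert(buffer.length == 3 * int.sizeof + 13);
}
```

The trade-off is that every append has to go through the helper; a bare ~= on the ubyte[] with a differently typed array would still need its own cast.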

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread BCS

Hello Andrei,


BCS wrote:


so use ubyte, insert a cast, and call it a day. If you have too many
casts, the problem is most likely elsewhere


You might be correct, but I don't think any of us have enough info
right now to make that assertion.


Oh there is enough information. What's needed is:

const(ubyte)[] getRepresentation(T)(T[] data)
{
    return cast(typeof(return)) data;
}

If you have many calls to getRepresentation(), then that
anticlimactically shows that you need to look at arrays'
representations often. If there are too many of those, maybe some of
the said arrays should be dealt with as ubyte[] in the first place.


Maybe in some cases but if the primary function of the code is processing 
stuff between "raw data" and other data types then the above is irrelevant. 
The OP sort of hinted somewhere that this is the kind of thing he is working 
on. Without knowing what the OP is doing, I still don't think we can say 
if his program is well designed.





Re: static this sucks, we should deprecate it

2009-05-31 Thread Walter Bright

Christopher Wright wrote:
Eh, this would have to extend to every function, since static ctors can 
call functions. And these functions can be provided without 
implementations via a .di file. This is fail.


Such problems are called "whole program analysis", or "interprocedural 
analysis". There are a lot of cool things you can do with that, but of 
course they require 100% of the program text to be available to the 
compiler.


That isn't going to happen with D (even if all the D source were 
available, what about calling C binaries?). So we have to rely on other 
mechanisms.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Sun, 31 May 2009 23:11:57 +0300, grauzone  wrote:

>> 3) It's very rare in practice that the only pointer to your object  
>> (which you still plan to access later) to be stored in a  
>> void[]-allocated array! Remember, the properties of memory regions are  
>> determined when the memory is allocated, so casting an array of  
>> structures to a void[] will not lose you that reference. You'd need to  
>> move your pointer to a void[]-array (which you need to allocate  
>> explicitly or, for example, concatenating your reference to the  
>> void[]), then drop the reference to your original structure, for this  
>> to happen.
>
> void[] = can contain pointers
> ubyte[] = can not contain pointers
>
> void[] just wraps void*, which is a low level type and can contain  
> anything. Because of that, the conservative GC needs to scan it for  
> pointers. ubyte[], on the other hand, contains sequences of 8 bit  
> integers. For untyped binary data, ubyte[] is the most correct type.
>
> You want to send it over network or write it into a file? Use ubyte[].  
> The data will never contain any pointers. You want to play low level  
> tricks, that involve copying around arbitrary memory contents (like  
> boxing, see std.boxer)? Use void[].

std.boxer is actually a valid counter-example for my post.
The specific fix is simple: replace the void[] with void*[].
The generic "fix" is just to add a line to 
http://www.digitalmars.com/d/garbage.html adding that hiding your only 
reference in a void[] results in undefined behavior. I don't think this should 
be an inconvenience to any projects?

> You shouldn't cast structs or any other types to ubyte[], because the  
> memory representation of those type is highly platform specific. Structs  
> can contain padding, integers are endian-dependent... If you want to  
> convert these to binary data, write a marshaller. You _never_ want to do  
> direct casts, because they're simply unportable. If you do the cast, you  
> have to know what you're doing.

Thanks for the advice, but I actually know what I'm doing. Unlike C, D's 
structure alignment rules are actually part of the specification. If I wanted 
my programs to be safe/cross-platform/etc. regardless of execution speed, I'd 
use a scripting or VM-ed language.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: static this sucks, we should deprecate it

2009-05-31 Thread Walter Bright

BCS wrote:
An executable that never works and fails with a reasonable error message 
at startup is *loads* better than one that either silently runs 
incorrectly (generates bad results) or erratically fails. I see a 
*major* difference.


Not just a major difference, but a fundamental one.

A failure that happens obviously and repeatably is fundamentally 
different from a failure that goes unnoticed or is not repeatable.


A basic principle of developing robust systems is to make any failures 
obvious. That's why, for example, airplanes have things that must be 
removed before flight attached to big red flags that hang outside. You 
don't really want to find out after you're airborne that your pitot 
tubes still have the dust cap on!


Re: static this sucks, we should deprecate it

2009-05-31 Thread Walter Bright

Denis Koroskin wrote:

Which is even worse. Walter stated that "silently generating bad
code" (i.e. code that doesn't work) is a top priority bug.


It fails immediately on trying to run it, it is not silently failing.

Silently failing is having a dependency on the order of initialization, 
but not detecting it, and initializing things in the wrong order.


Re: visualization of language benchmarks

2009-05-31 Thread Tim Matthews

Jarrett Billingsley wrote:

On Sun, May 31, 2009 at 1:14 PM, Knud Soerensen
<4tuu4k...@sneakemail.com> wrote:

Check the nice article on

http://gmarceau.qc.ca/blog/2009/05/speed-size-and-dependability-of.html


--
Join me on
CrowdNews  http://crowdnews.eu/users/addGuide/42/
Facebook   http://www.facebook.com/profile.php?id=1198821880
Linkedin   http://www.linkedin.com/pub/0/117/a54
Mandala    http://www.mandala.dk/view-profile.php4?profileID=7660



It's things like this that make me want to get into visualization.
Great article!


Where's the D


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

BCS wrote:

so use ubyte, insert a cast, and call it a day. If you have too
many casts, the problem is most likely elsewhere


You might be correct, but I don't think any of us have enough info right 
now to make that assertion.


Oh there is enough information. What's needed is:

const(ubyte)[] getRepresentation(T)(T[] data)
{
    return cast(typeof(return)) data;
}

If you have many calls to getRepresentation(), then that 
anticlimactically shows that you need to look at arrays' representations 
often. If there are too many of those, maybe some of the said arrays 
should be dealt with as ubyte[] in the first place.
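
A short usage sketch for the function above, assuming a current D2 compiler. The function is repeated so the snippet stands alone; the variable names are illustrative only:

```d
const(ubyte)[] getRepresentation(T)(T[] data)
{
    return cast(typeof(return)) data;
}

void main()
{
    int[] values = [1, 2, 3];
    auto bytes = getRepresentation(values);
    // The array cast rescales the length from elements to bytes.
    assert(bytes.length == values.length * int.sizeof);

    auto text = getRepresentation("abc");
    assert(text.length == 3 && text[0] == 'a');
}
```

Because the result is const(ubyte)[], the view can be read or appended elsewhere but not used to scribble over the original array.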



Andrei


Re: static this sucks, we should deprecate it

2009-05-31 Thread Walter Bright

Jarrett Billingsley wrote:

On Sun, May 31, 2009 at 4:39 PM, Walter Bright
 wrote:

The solution is relatively robust and straightforward. Create a third
module, AB. Module A and module B both import AB. Put the static
constructors for both A and B in module AB. The order of initialization
problem is robustly solved, and all the interdependencies of initialization
of A and B are explicitly laid out in AB.


If I might speak from personal experience, what usually ends up
happening instead is that A and B get merged into a single module.
This happens enough times, and you have half your code in one file.


What is wrong with the approach I outlined? I use it, it works fine.



Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Denis Koroskin
On Mon, 01 Jun 2009 00:53:02 +0400, BCS  wrote:

> Hello Vladimir,
>
>> I just went through a ~15000-line project and replaced most
>> occurrences of void[]. Now the project is an ugly mess of void[],
>> ubyte[] and casts, but at least it doesn't leak memory like crazy any
>> more.
>>  I don't know why it was decided to mark the contents of void[] as
>> "might have pointers". It makes no sense! Consider:
>>  2) Despite that void[] is "typeless", you can still operate on it -
>> namely, slice and concatenate them. Pass a void[] to a network send()
>> function - how much did you send? Half the buffer? No problem, slice
>> it away and store the rest - and no casts.
>>  3) It's very rare in practice that the only pointer to your object
>> (which you still plan to access later) to be stored in a
>> void[]-allocated array! Remember, the properties of memory regions are
>> determined when the memory is allocated, so casting an array of
>> structures to a void[] will not lose you that reference. You'd need to
>> move your pointer to a void[]-array (which you need to allocate
>> explicitly or, for example, concatenating your reference to the
>> void[]), then drop the reference to your original structure, for this
>> to happen.
>>
>
> I think the idea is that void[] is the most general data type; it can be  
> anything, including pointers.  
> Also for a real world use case where void[]=mightHavePointers is valid,  
> consider a system that reads blocks of data structures from a file and  
> then does in place substitution from file references to memory references.  
> You can't allocate buffers of the correct type because you may not even  
> know what that is until you have already loaded the data.
>

In this case you should *explicitly* mark that void[] array as 
"mightHavePointers".


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread BCS

Hello Vladimir,


I just went through a ~15000-line project and replaced most
occurrences of void[]. Now the project is an ugly mess of void[],
ubyte[] and casts, but at least it doesn't leak memory like crazy any
more.

I don't know why it was decided to mark the contents of void[] as
"might have pointers". It makes no sense! Consider:

2) Despite that void[] is "typeless", you can still operate on it -
namely, slice and concatenate them. Pass a void[] to a network send()
function - how much did you send? Half the buffer? No problem, slice
it away and store the rest - and no casts.

3) It's very rare in practice that the only pointer to your object
(which you still plan to access later) to be stored in a
void[]-allocated array! Remember, the properties of memory regions are
determined when the memory is allocated, so casting an array of
structures to a void[] will not lose you that reference. You'd need to
move your pointer to a void[]-array (which you need to allocate
explicitly or, for example, concatenating your reference to the
void[]), then drop the reference to your original structure, for this
to happen.



I think the idea is that void[] is the most general data type; it can be 
anything, including pointers. 

Also for a real world use case where void[]=mightHavePointers is valid, consider 
a system that reads blocks of data structures from a file and then does in 
place substitution from file references to memory references. You can't allocate 
buffers of the correct type because you may not even know what that is until 
you have already loaded the data.




Here's a simple naive implementation of a buffer:

void[] buffer;
void queue(void[] data)
{
    buffer ~= data;
}
...
queue([1,2,3][]);
queue("Hello, World!");

No casts! So simple and beautiful. However, should you use this
pattern to work with larger amounts of data with a high entropy, the
"minefield" effect will cause the GC to stop collecting most data.
Sure, you can call std.gc.hasNoPointers, but you need to do it after
every single concatenation... and it makes expressions with more than
one concatenation unsafe.


Yes, when data is being copied into void[] from another type[] it is reasonable 
to ignore pointers but as above, going the other way (IMHO the /common/ case) 
it's not so easy.




I heard that Tango copies over the properties of arrays when they are
reallocated, which helps but solves the problem only partially.

So, I ask you: is there actually code out there that depends on the
way void[] works right now? I brought up this argument a year or so
ago on IRC, and there were people who defended ferociously the current
design using idealisms ("it should work like what it sounds like, it
should contain any type" or something like that), but I've yet to see
a practical argument.


I think that void[] should be left as is but I'm almost ready to throw in 
with the idea that we **need** another type that has the no-cast parts of 
void[] but assume no pointers as well.





Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread BCS

Hello grauzone,


You shouldn't cast structs or any other types to ubyte[], because the
memory representation of those type is highly platform specific.
Structs can contain padding, integers are endian-dependent... If you
want to convert these to binary data, write a marshaller. You _never_
want to do direct casts, because they're simply unportable. If you do
the cast, you have to know what you're doing.



Never say never. Some cases like tmp files or whatnot where the same exe 
will save and load the file never* have any need for portability.


*"never" used intentionally :b.




Re: static this sucks, we should deprecate it

2009-05-31 Thread Jarrett Billingsley
On Sun, May 31, 2009 at 4:39 PM, Walter Bright
 wrote:
>
> The solution is relatively robust and straightforward. Create a third
> module, AB. Module A and module B both import AB. Put the static
> constructors for both A and B in module AB. The order of initialization
> problem is robustly solved, and all the interdependencies of initialization
> of A and B are explicitly laid out in AB.

If I might speak from personal experience, what usually ends up
happening instead is that A and B get merged into a single module.
This happens enough times, and you have half your code in one file.

The only way to avoid this is either to create circularly-importing
modules, which are considered bad practice (and also cause DMD's
forward reference bugs to rear their heads), or to completely refactor
your code, which is rarely an attractive option.


Re: Source control for all dmd source

2009-05-31 Thread Jérôme M. Berger

hasen wrote:

Leandro Lucarella wrote:

"Jérôme M. Berger", el 31 de mayo a las 19:03 me escribiste:

Leandro Lucarella wrote:

Well, that's great to hear! As Robert said, please tell if you need any
help. Please, please, please consider using a distributed SCM (I think git
would be ideal but mercurial is good too). That would make merging and
branching a lot easier.

Git is a bad choice because of its poor Windows support. Mercurial or
Bazaar are much better in this regard.


That's a myth, Git is pretty much supported in Windows now. I know people
that use it on a regular basis. See:
http://code.google.com/p/msysgit/



Yea, I use git on windows (msysgit), works like a charm.

+1 for git
	You just have to look at the name to see that it doesn't really 
work on Windows: it requires a Unix emulation layer (msys in your 
case, others do it with cygwin) which a lot of people won't want to 
install/use.


Jerome

PS: I usually use Linux at home and Windows at work. I do have msys 
installed on the Windows machine and I don't consider it really 
usable (the only reason I have it is to compile open source 
autoconf-based packages). Mercurial (or Bazaar) work in a standard 
Windows command shell as well as inside a cygwin/msys shell or any 
of the alternative shells if you prefer.

--
mailto:jeber...@free.fr
http://jeberger.free.fr
Jabber: jeber...@jabber.fr





Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread BCS

Hello Andrei,


Vladimir Panteleev wrote:


This isn't about performance, this is about having one thousand casts
all over my code. It becomes a burden to cast everything to ubyte[]
when working with abstract binary data. For example, when building a
MIME multipart message with binary fields, every line needs to have a
cast in it - when we could have just used the ~= operator to append
to a void[].


Another alternative would be to allow implicitly casting arrays of any
type to const(ubyte)[] which is always safe.


sounds like something that might work. 


But I think this is too
much ado about nothing - you're avoiding the type system to start
with,


I'm not sure he is (or at least, he is in a very well defined way; "I need 
to look at this data as its bytes")



so use ubyte, insert a cast, and call it a day. If you have too
many casts, the problem is most likely elsewhere


You might be correct, but I don't think any of us have enough info right 
now to make that assertion. 





Re: static this sucks, we should deprecate it

2009-05-31 Thread Walter Bright

Tim Matthews wrote:

Walter Bright wrote:
It's unreliable because how do you specify the load order? And how 
does the user relate that to the source semantics?


grauzone suggested this earlier:
static this {} //full dependencies (all import statements)
static this : a, b {} //only dependent from module a and b
static this : void {} //no dependencies at all


Such annotations tend to get seriously out of date and wrong as code 
evolves. For example, if A and B import each other:


-
--- A ---
import B;
int foo() { ... }
static this()
{
   ...
}

--- B ---
import A;
import C;
int x;
static this : C()
{
   x = A.foo();
}

---

Now, this may work just fine, until module A gets updated at some point 
in the future to depend on its static constructor. Then, B.x will get 
some unpredictable value, depending on the order of initialization.


So, in general, annotations in one module that specify what happens in 
another module are bad maintenance bugs waiting to happen.


The solution is relatively robust and straightforward. Create a third 
module, AB. Module A and module B both import AB. Put the static 
constructors for both A and B in module AB. The order of initialization 
problem is robustly solved, and all the interdependencies of 
initialization of A and B are explicitly laid out in AB.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Sun, 31 May 2009 22:41:47 +0300, Walter Bright  
wrote:

> Vladimir Panteleev wrote:
>> I don't know why it was decided to mark the contents of void[] as
>> "might have pointers". It makes no sense! Consider:
>
> [...]
>
>> 3) It's very rare in practice that the only pointer to your
>> object (which you still plan to access later) to be stored in a
>> void[]-allocated array!
>
> Rare or common, it still would be a nasty bug lurking to catch someone.  
> The default behavior in D should be to be correct code. Doing  
> potentially unsafe things to improve performance should require extra  
> effort - in this case it would be either using the gc function to mark  
> the memory as not containing pointers, or storing them as ubyte[]  
> instead.

I just realized that by "performance" you might have meant memory leaks. Well, 
sure, if you can say that my programs crashing every few hours due to running 
out of memory is a "performance" problem. I'm sorry to sound bitter, but this 
was the cause of much annoyance for my software's users. I had to write a 
memory debugger to understand that no matter how much you chase void[]s with 
hasNoPointers, there will always be that one ~ which you overlooked.

As much as I try to look from an objective perspective, I don't see how a 
memory leak (and memory leaks in D usually mean that NO memory is being freed, 
except for small lucky objects not having bogus pointers to them) is a problem 
less significant than an obscure case that involves allocating a void[], 
storing a pointer in it and losing all other references to the object. In fact, 
I just searched the D documentation and I couldn't find a statement saying 
whether void[] are scanned by the GC or not. Enter mr. D-newbie, who wants to 
write his own network/compression/file-copying/etc. library/program and 
stumbles upon void[], the seemingly perfect abstract-binary-data-container type 
for the job... (which is exactly what happened with yours truly).

P.S. Not trying to push my point of view, but just trying to offer some 
perspective from someone who has been bit by this design choice...

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

Vladimir Panteleev wrote:

On Sun, 31 May 2009 22:41:47 +0300, Walter Bright
 wrote:


Vladimir Panteleev wrote:

I don't know why it was decided to mark the contents of void[] as
 "might have pointers". It makes no sense! Consider:

[...]

3) It's very rare in practice that the only pointer to your 
object (which you still plan to access later) to be stored in a 
void[]-allocated array!

Rare or common, it still would be a nasty bug lurking to catch
someone. The default behavior in D should be to be correct code.
Doing potentially unsafe things to improve performance should
require extra effort - in this case it would be either using the gc
function to mark the memory as not containing pointers, or storing
them as ubyte[] instead.


This isn't about performance, this is about having one thousand casts
all over my code. It becomes a burden to cast everything to ubyte[]
when working with abstract binary data. For example, when building a
MIME multipart message with binary fields, every line needs to have a
cast in it - when we could have just used the ~= operator to append
to a void[].


Another alternative would be to allow implicitly casting arrays of any 
type to const(ubyte)[] which is always safe. But I think this is too 
much ado about nothing - you're avoiding the type system to start with, 
so use ubyte, insert a cast, and call it a day. If you have too many 
casts, the problem is most likely elsewhere so that argument I'm not buying.


Andrei


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread grauzone

3) It's very rare in practice that the only pointer to your object (which you 
still plan to access later) to be stored in a void[]-allocated array! Remember, 
the properties of memory regions are determined when the memory is allocated, 
so casting an array of structures to a void[] will not lose you that reference. 
You'd need to move your pointer to a void[]-array (which you need to allocate 
explicitly or, for example, concatenating your reference to the void[]), then 
drop the reference to your original structure, for this to happen.


void[] = can contain pointers
ubyte[] = can not contain pointers

void[] just wraps void*, which is a low level type and can contain 
anything. Because of that, the conservative GC needs to scan it for 
pointers. ubyte[], on the other hand, contains sequences of 8 bit 
integers. For untyped binary data, ubyte[] is the most correct type.


You want to send it over network or write it into a file? Use ubyte[]. 
The data will never contain any pointers. You want to play low level 
tricks, that involve copying around arbitrary memory contents (like 
boxing, see std.boxer)? Use void[].


I think that's a good way to distinguish it.

You shouldn't cast structs or any other types to ubyte[], because the 
memory representation of those type is highly platform specific. Structs 
can contain padding, integers are endian-dependent... If you want to 
convert these to binary data, write a marshaller. You _never_ want to do 
direct casts, because they're simply unportable. If you do the cast, you 
have to know what you're doing.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Sun, 31 May 2009 22:41:47 +0300, Walter Bright  
wrote:

> Vladimir Panteleev wrote:
>> I don't know why it was decided to mark the contents of void[] as
>> "might have pointers". It makes no sense! Consider:
>
> [...]
>
>> 3) It's very rare in practice that the only pointer to your
>> object (which you still plan to access later) to be stored in a
>> void[]-allocated array!
>
> Rare or common, it still would be a nasty bug lurking to catch someone.  
> The default behavior in D should be to be correct code. Doing  
> potentially unsafe things to improve performance should require extra  
> effort - in this case it would be either using the gc function to mark  
> the memory as not containing pointers, or storing them as ubyte[]  
> instead.

This isn't about performance, this is about having one thousand casts all over 
my code. It becomes a burden to cast everything to ubyte[] when working with 
abstract binary data. For example, when building a MIME multipart message with 
binary fields, every line needs to have a cast in it - when we could have just 
used the ~= operator to append to a void[].

Alternative solutions would be to have a second type (either new or one of the 
existing, e.g. ubyte[]) act as void[] (any array type casts to it implicitly) 
but not be scanned by the GC, but I doubt this is something you'll consider.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Denis Koroskin
On Sun, 31 May 2009 22:45:23 +0400, Vladimir Panteleev 
 wrote:
 
> I just went through a ~15000-line project and replaced most occurrences  
> of void[]. Now the project is an ugly mess of void[], ubyte[] and casts,  
> but at least it doesn't leak memory like crazy any more.
>  
> I don't know why it was decided to mark the contents of void[] as "might  
> have pointers". It makes no sense!
>  
 
FWIW, I also consider void[] as a storage for an arbitrary untyped binary data, 
and thus I believe GC shouldn't scan it.
Ignoring void[] arrays is correct behavior in 99% of cases (and a bug in the 
rest), and it improves application execution speed significantly.
 
While it is possible to prevent GC from scanning an arbitrary void[] array, 
there is no reasonable way to prevent it from scanning all arrays (without 
modifying GC code).
 
It is a breaking change, but not too late for D2.

++vote


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Denis Koroskin
On Sun, 31 May 2009 22:45:23 +0400, Vladimir Panteleev 
 wrote:

> I just went through a ~15000-line project and replaced most occurrences  
> of void[]. Now the project is an ugly mess of void[], ubyte[] and casts,  
> but at least it doesn't leak memory like crazy any more.
>
> I don't know why it was decided to mark the contents of void[] as "might  
> have pointers". It makes no sense!
>

FWIW, I also consider void[] as a storage for an arbitrary untyped binary data, 
and thus I believe GC shouldn't scan it.

While it is possible to prevent GC from scanning an arbitrary void[] array, 
there is no reasonable way to prevent it from scanning all arrays.

It is a breaking change, but may be changed for D2. In 99% it is a correct 
behavior (and a bug in a rest), but reduces application execution speed 
significantly.

++vote


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Walter Bright

Vladimir Panteleev wrote:

I don't know why it was decided to mark the contents of void[] as
"might have pointers". It makes no sense! Consider:


[...]


3) It's very rare in practice that the only pointer to your
object (which you still plan to access later) to be stored in a
void[]-allocated array!


Rare or common, it still would be a nasty bug lurking to catch someone. 
The default behavior in D should be to be correct code. Doing 
potentially unsafe things to improve performance should require extra 
effort - in this case it would be either using the gc function to mark 
the memory as not containing pointers, or storing them as ubyte[] instead.


Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
I just went through a ~15000-line project and replaced most occurrences of 
void[]. Now the project is an ugly mess of void[], ubyte[] and casts, but at 
least it doesn't leak memory like crazy any more.

I don't know why it was decided to mark the contents of void[] as "might have 
pointers". It makes no sense! Consider:

1) void[] has this wonderful, magical property that any array type implicitly 
casts to void[]. This makes it wonderful to use in libraries and functions that 
manipulate data with no regards to what it actually contains. Network 
libraries, compression libraries, etc. - right about anywhere where you'd use a 
void* and length in C++, a void[] is just as appropriate.
2) Despite that void[] is "typeless", you can still operate on it - namely, 
slice and concatenate them. Pass a void[] to a network send() function - how 
much did you send? Half the buffer? No problem, slice it away and store the 
rest - and no casts.
3) It's very rare in practice that the only pointer to your object (which you 
still plan to access later) to be stored in a void[]-allocated array! Remember, 
the properties of memory regions are determined when the memory is allocated, 
so casting an array of structures to a void[] will not lose you that reference. 
You'd need to move your pointer to a void[]-array (which you need to allocate 
explicitly or, for example, concatenating your reference to the void[]), then 
drop the reference to your original structure, for this to happen.

Here's a simple naive implementation of a buffer:

void[] buffer;
void queue(void[] data)
{
    buffer ~= data;
}
...
queue([1,2,3][]);
queue("Hello, World!");

No casts! So simple and beautiful. However, should you use this pattern to work 
with larger amounts of data with a high entropy, the "minefield" effect will 
cause the GC to stop collecting most data. Sure, you can call 
std.gc.hasNoPointers, but you need to do it after every single concatenation... 
and it makes expressions with more than one concatenation unsafe.
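
To make the "after every single concatenation" point concrete, here is a rough sketch using today's druntime rather than the D1-era std.gc module mentioned above. It assumes `core.memory.GC.setAttr`, `GC.addrOf` and `GC.BlkAttr.NO_SCAN` as the modern counterparts of hasNoPointers; the `queue` wrapper is hypothetical:

```d
import core.memory : GC;

void[] buffer;

void queue(void[] data)
{
    buffer ~= data;
    // An append may move the array into a new block, so mark it again.
    // addrOf maps the array pointer to the base of its GC block.
    GC.setAttr(GC.addrOf(buffer.ptr), GC.BlkAttr.NO_SCAN);
}

void main()
{
    int[] numbers = [1, 2, 3];
    queue(numbers);
    queue("Hello, World!".dup);
    assert((GC.getAttr(GC.addrOf(buffer.ptr)) & GC.BlkAttr.NO_SCAN) != 0);
}
```

This only works while every append goes through the wrapper; an expression such as a ~ b ~ c builds temporaries that never get marked, which is the weakness described above.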

I heard that Tango copies over the properties of arrays when they are 
reallocated, which helps but solves the problem only partially.

So, I ask you: is there actually code out there that depends on the way void[] 
works right now? I brought up this argument a year or so ago on IRC, and there 
were people who defended ferociously the current design using idealisms ("it 
should work like what it sounds like, it should contain any type" or something 
like that), but I've yet to see a practical argument.


P.S. How come the standard library doesn't have a simple function like this?

T[] toArray(T)(inout T data) { return (&data)[0..1]; }

It happens often that I need to get a slice of memory around an object's 
reference (for example to pass it to a function that takes a void[] :D), and 
typing (&x)[0..1] every time feels like a hack.
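
For readers on D2, a sketch of the same helper, assuming `ref` in place of the D1 `inout` parameter class used above. It is not a standard library function as far as this thread establishes, so the name `toArray` remains hypothetical:

```d
// Hypothetical helper: view a single variable as a one-element slice.
T[] toArray(T)(ref T data)
{
    return (&data)[0 .. 1];
}

void main()
{
    int x = 5;
    auto slice = toArray(x);
    slice[0] = 7;            // writes through to x
    assert(x == 7);

    void[] raw = toArray(x); // any T[] converts implicitly to void[]
    assert(raw.length == int.sizeof);
}
```

The returned slice aliases the variable, so it must not outlive it.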

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Source control for all dmd source

2009-05-31 Thread hasen

Leandro Lucarella wrote:
> "Jérôme M. Berger", on May 31 at 19:03 you wrote me:
>> Leandro Lucarella wrote:
>>> Well, that's great to hear! As Robert said, please tell if you need any
>>> help. Please, please, please consider using a distributed SCM (I think git
>>> would be ideal but mercurial is good too). That would make merging and
>>> branching a lot easier.
>> Git is a bad choice because of its poor Windows support. Mercurial or
>> Bazaar are much better in this regard.
>
> That's a myth; Git is pretty well supported on Windows now. I know people
> who use it on a regular basis. See:
> http://code.google.com/p/msysgit/

Yea, I use git on windows (msysgit), works like a charm.

+1 for git


Re: Source control for all dmd source

2009-05-31 Thread Leandro Lucarella
"Jérôme M. Berger", on May 31 at 19:03 you wrote me:
> Leandro Lucarella wrote:
> >Well, that's great to hear! As Robert said, please tell if you need any
> >help. Please, please, please consider using a distributed SCM (I think git
> >would be ideal but mercurial is good too). That would make merging and
> >branching a lot easier.
>   Git is a bad choice because of its poor Windows support. Mercurial or 
> Bazaar are much better in this regard.

That's a myth; Git is pretty well supported on Windows now. I know people
who use it on a regular basis. See:
http://code.google.com/p/msysgit/

-- 
Leandro Lucarella (luca) | Collective blog: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

We must not fear death, for it is life's best reward.
-- Ren & Stimpy


Re: Compile-time generated code... not that nice

2009-05-31 Thread Steve Teale
Ary Borenszweig Wrote:

> I just realized that code generated at compile-time (with string mixins) 
> is hard (or impossible) to debug (I mean at runtime, not just at 
> compile-time). Do you think it's a big shortcoming of D? How can this 
> be solved?
> 
> Maybe the compiler can generate a file with the contents of a module 
> with mixins expanded, and use these files as the input to the compiler 
> and linker, so these can be used in the debugging process.

Ary,

And because often the only way to make code compile with both D1 and D2 is to 
use a mixin in the D2 version block, it follows that D2 code will be a pain to 
debug.
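
To make the pattern concrete, here is a minimal sketch (the declaration is invented for illustration; D_Version2 is the version identifier predefined by D2 compilers). The D2-only syntax has to sit inside a string because a D1 compiler still parses the body of a version block it skips, and that string is the part that is hard to follow in a debugger:

```d
version (D_Version2)
{
    // D1 would reject `immutable` here, so it is hidden in a string mixin.
    mixin("immutable int answer = 42;");
}
else
{
    // Plain D1 equivalent, visible to the debugger as ordinary source.
    const int answer = 42;
}

unittest
{
    assert(answer == 42);
}
```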

What are you using for debugging? I'm still using writefln, or failing that, 
printf, but I would like to advance.

Steve



Re: visualization of language benchmarks

2009-05-31 Thread Jarrett Billingsley
On Sun, May 31, 2009 at 1:14 PM, Knud Soerensen
<4tuu4k...@sneakemail.com> wrote:
> Check the nice article on
>
> http://gmarceau.qc.ca/blog/2009/05/speed-size-and-dependability-of.html
>
>
> --
> Join me on
> CrowdNews  http://crowdnews.eu/users/addGuide/42/
> Facebook   http://www.facebook.com/profile.php?id=1198821880
> Linkedin   http://www.linkedin.com/pub/0/117/a54
> Mandala    http://www.mandala.dk/view-profile.php4?profileID=7660
>

It's things like this that make me want to get into visualization.
Great article!


visualization of language benchmarks

2009-05-31 Thread Knud Soerensen

Check the nice article on

http://gmarceau.qc.ca/blog/2009/05/speed-size-and-dependability-of.html


--
Join me on
CrowdNews  http://crowdnews.eu/users/addGuide/42/
Facebook   http://www.facebook.com/profile.php?id=1198821880
Linkedin   http://www.linkedin.com/pub/0/117/a54
Mandala    http://www.mandala.dk/view-profile.php4?profileID=7660


Re: Source control for all dmd source

2009-05-31 Thread Jérôme M. Berger

Leandro Lucarella wrote:
> Well, that's great to hear! As Robert said, please tell if you need any
> help. Please, please, please consider using a distributed SCM (I think git
> would be ideal but mercurial is good too). That would make merging and
> branching a lot easier.

	Git is a bad choice because of its poor Windows support. Mercurial 
or Bazaar are much better in this regard.


Jerome
--
mailto:jeber...@free.fr
http://jeberger.free.fr
Jabber: jeber...@jabber.fr




Re: Source control for all dmd source

2009-05-31 Thread Andrei Alexandrescu

Leandro Lucarella wrote:
> Walter Bright, on May 30 at 21:54 you wrote me:
>> Jason House wrote:
>>> Over in D.announce, the LDC devs said they would have an easier time
>>> upgrading to newer dmd (fe) versions if the source was in source
>>> control. Even if Walter is the only one with write access, it'd still
>>> be helpful. It's helpful for more than just the LDC folks; that's
>>> just the most recent example.
>>> How about it Walter? IIRC, Andrei has been encouraging you to do
>>> this for a while now.
>>
>> It's a great idea, I just haven't gotten my act together yet.
>
> Well, that's great to hear! As Robert said, please tell if you need any
> help. Please, please, please consider using a distributed SCM (I think git
> would be ideal but mercurial is good too). That would make merging and
> branching a lot easier.
>
> Thanks!



We should use Bartosz's Code Co-op!!!

Andrei


Re: Source control for all dmd source

2009-05-31 Thread Leandro Lucarella
Walter Bright, on May 30 at 21:54 you wrote me:
> Jason House wrote:
> >Over in D.announce, the LDC devs said they would have an easier time
> >upgrading to newer dmd (fe) versions if the source was in source
> >control. Even if Walter is the only one with write access, it'd still
> >be helpful. It's helpful for more than just the LDC folks; that's
> >just the most recent example.
> >How about it Walter? IIRC, Andrei has been encouraging you to do
> >this for a while now.
> 
> It's a great idea, I just haven't gotten my act together yet.

Well, that's great to hear! As Robert said, please tell if you need any
help. Please, please, please consider using a distributed SCM (I think git
would be ideal but mercurial is good too). That would make merging and
branching a lot easier.

Thanks!

-- 
Leandro Lucarella (luca) | Collective blog: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

What is it that makes a compass always point north?
- Being a needle, nothing more, and fulfilling its mission.
-- Ricardo Vaporeso


Re: Source control for all dmd source

2009-05-31 Thread Leandro Lucarella
Walter Bright, on May 30 at 21:56 you wrote me:
> Sean Kelly wrote:
> >It would be nice to have DMD in version control, but I don't buy the LDC
> >argument.  It's trivial to diff one release against another, regardless of
> >whether version control is involved.
> 
> A fantastic program to do this is meld on Ubuntu.
> 
> sudo apt-get install meld
> 
> It has done a lot for my productivity when merging diffs.

Meld is really nice, but it doesn't do semantic analysis to tell you why
things have changed and which change is for what =)

Meld can be used as a palliative for not having (at least) the frontend in
a SCM and doing well-documented, self-contained commits; not as
a solution.

-- 
Leandro Lucarella (luca) | Collective blog: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

Hey you, out there beyond the wall,
Breaking bottles in the hall,
Can you help me?


Re: Source control for all dmd source

2009-05-31 Thread Robert Clipsham

Walter Bright wrote:
> Jason House wrote:
>> Over in D.announce, the LDC devs said they would have an easier time
>> upgrading to newer dmd (fe) versions if the source was in source
>> control. Even if Walter is the only one with write access, it'd still
>> be helpful. It's helpful for more than just the LDC folks; that's
>> just the most recent example.
>>
>> How about it Walter? IIRC, Andrei has been encouraging you to do
>> this for a while now.
>
> It's a great idea, I just haven't gotten my act together yet.


I'm sure there are a number of people that would be willing to set it up 
for you (including myself), just say the word!