Re: Why are void[] contents marked as having pointers?

2009-06-05 Thread Vladimir Panteleev
On Thu, 04 Jun 2009 21:31:07 +0300, Denis Koroskin 2kor...@gmail.com  
wrote:


On Thu, 04 Jun 2009 22:16:42 +0400, Vladimir Panteleev  
thecybersha...@gmail.com wrote:


On Thu, 04 Jun 2009 05:10:17 +0300, Christopher Wright  
dhase...@gmail.com wrote:



bearophile wrote:

Christopher Wright:

Another good point. Or how do you index it by byte?
 How can you read & write files of 3 bytes if voids are 4 bytes long  
chunks? :o) I don't understand. I want to read and write files  
byte-by-byte.

 Bye,
bearophile


Vladimir was suggesting that void[] be the same as ubyte[] and that  
you use void*[] if you might include a pointer. So that use case would  
be safe.


Actually, I think Andrei's idea is better (to allow implicitly casting  
arrays of non-reference types to const(ubyte)[]). It introduces an  
abstract "no-pointers" type, but still allows implicit casting to "might  
have pointers".




There is a pitfall: should arrays of non-reference types be  
implicitly castable to const(byte)[] or const(ubyte)[]?


Should const(byte)[] also be implicitly castable to const(ubyte)[] (or  
vice versa)?


I don't see why you'd want to work with arrays of signed bytes. It doesn't  
make sense to allow implicit casting between the two; the programmer  
should just pick one and stick with it. I think unsigned makes more sense.


--
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-05 Thread BCS

Hello Vladimir,


I don't see why you'd want to work with arrays of signed bytes.


I can think of a number of cases where I would expect numbers to be in a 
range like [-20,+20], for instance, delta of small integral value or golf 
scores relative to par.





Re: Why are void[] contents marked as having pointers?

2009-06-05 Thread Vladimir Panteleev

On Fri, 05 Jun 2009 10:15:11 +0300, BCS n...@anon.com wrote:


Hello Vladimir,


I don't see why you'd want to work with arrays of signed bytes.


I can think of a number of cases where I would expect numbers to be in a  
range like [-20,+20], for instance, delta of small integral value or  
golf scores relative to par.


Yes, but how is this related to abstracting data types to a generic type  
that can be used for stuff like buffering or networking?


--
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-05 Thread BCS

Hello Vladimir,


On Fri, 05 Jun 2009 10:15:11 +0300, BCS n...@anon.com wrote:


Hello Vladimir,


I don't see why you'd want to work with arrays of signed bytes.


I can think of a number of cases where I would expect numbers to be
in a  range like [-20,+20], for instance, delta of small integral
value or  golf scores relative to par.


Yes, but how is this related to abstracting data types to a generic
type  that can be used for stuff like buffering or networking?



It's not and that's the point. The point is there are uses for 8-bit signed 
integer values other than as raw data. I might have read your comment out 
of context but it seemed you were saying there is no use for the signed byte 
type.





Re: Why are void[] contents marked as having pointers?

2009-06-05 Thread Vladimir Panteleev

On Fri, 05 Jun 2009 20:16:08 +0300, BCS n...@anon.com wrote:


Hello Vladimir,


On Fri, 05 Jun 2009 10:15:11 +0300, BCS n...@anon.com wrote:


Hello Vladimir,


I don't see why you'd want to work with arrays of signed bytes.


I can think of a number of cases where I would expect numbers to be
in a  range like [-20,+20], for instance, delta of small integral
value or  golf scores relative to par.


Yes, but how is this related to abstracting data types to a generic
type  that can be used for stuff like buffering or networking?



It's not and that's the point. The point is there are uses for 8-bit  
signed integer values other than as raw data. I might have read your  
comment out of context but it seemed you were saying there is no use for  
the signed byte type.


Oh yes; I was definitely not suggesting removing byte[] from the language.  
<insidejoke namespace="#d">I'm sure he wouldn't be pleased one bit if we  
did that! :P</insidejoke>


--
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-05 Thread Derek Parnell
On Fri, 5 Jun 2009 07:15:11 + (UTC), BCS wrote:

 Hello Vladimir,
 
 I don't see why you'd want to work with arrays of signed bytes.
 
 I can think of a number of cases where I would expect numbers to be in a 
 range like [-20,+20], for instance, delta of small integral value or golf 
 scores relative to par.

Or sound wave sample points [-127, 127] 

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell


Re: Why are void[] contents marked as having pointers?

2009-06-04 Thread Daniel Keep


Christopher Wright wrote:
 bearophile wrote:
 Christopher Wright:
 Another good point. Or how do you index it by byte?

 How can you read & write files of 3 bytes if voids are 4 bytes long
 chunks? :o) I don't understand. I want to read and write files
 byte-by-byte.

 Bye,
 bearophile
 
 Vladimir was suggesting that void[] be the same as ubyte[] and that you
 use void*[] if you might include a pointer. So that use case would be safe.

How would you generically store the bits of this, then?

struct Gotcha { void* ptr; ubyte boo; }
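
A minimal D2 sketch of why this struct is awkward for a pointer-free void[]: it mixes a pointer with plain data, so its bytes cannot be treated as "definitely no pointers". The struct Plain is a hypothetical pointer-free counterpart added for comparison, and std.traits.hasIndirections is used only as an illustration of the distinction.

```d
import std.traits : hasIndirections;

struct Gotcha { void* ptr; ubyte boo; }

// Hypothetical counterpart without any pointer members.
struct Plain { uint len; ubyte boo; }

// Gotcha's memory may hold a live reference; Plain's cannot.
static assert(hasIndirections!Gotcha);
static assert(!hasIndirections!Plain);

// Padding makes Gotcha larger than a single pointer.
static assert(Gotcha.sizeof > (void*).sizeof);

void main() {}
```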


Re: Why are void[] contents marked as having pointers?

2009-06-04 Thread Vladimir Panteleev
On Thu, 04 Jun 2009 05:10:17 +0300, Christopher Wright  
dhase...@gmail.com wrote:



bearophile wrote:

Christopher Wright:

Another good point. Or how do you index it by byte?
 How can you read & write files of 3 bytes if voids are 4 bytes long  
chunks? :o) I don't understand. I want to read and write files  
byte-by-byte.

 Bye,
bearophile


Vladimir was suggesting that void[] be the same as ubyte[] and that you  
use void*[] if you might include a pointer. So that use case would be  
safe.


Actually, I think Andrei's idea is better (to allow implicitly casting  
arrays of non-reference types to const(ubyte)[]). It introduces an  
abstract "no-pointers" type, but still allows implicit casting to "might  
have pointers".


--
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-04 Thread Denis Koroskin
On Thu, 04 Jun 2009 22:16:42 +0400, Vladimir Panteleev  
thecybersha...@gmail.com wrote:


On Thu, 04 Jun 2009 05:10:17 +0300, Christopher Wright  
dhase...@gmail.com wrote:



bearophile wrote:

Christopher Wright:

Another good point. Or how do you index it by byte?
 How can you read & write files of 3 bytes if voids are 4 bytes long  
chunks? :o) I don't understand. I want to read and write files  
byte-by-byte.

 Bye,
bearophile


Vladimir was suggesting that void[] be the same as ubyte[] and that you  
use void*[] if you might include a pointer. So that use case would be  
safe.


Actually, I think Andrei's idea is better (to allow implicitly casting  
arrays of non-reference types to const(ubyte)[]). It introduces an  
abstract "no-pointers" type, but still allows implicit casting to "might  
have pointers".




There is a pitfall: should arrays of non-reference types be  
implicitly castable to const(byte)[] or const(ubyte)[]?


Should const(byte)[] also be implicitly castable to const(ubyte)[] (or  
vice versa)?


Re: Why are void[] contents marked as having pointers?

2009-06-03 Thread Christopher Wright

Jarrett Billingsley wrote:

On Tue, Jun 2, 2009 at 7:11 PM, Christopher Wright dhase...@gmail.com wrote:

Vladimir Panteleev wrote:

I wasn't suggesting any GC modifications, I was just suggesting that
void[]'s TypeInfo "has pointers" flag be set to false.

The suggestion was that void[] be used as ubyte[] currently is, and then to
use void*[] to indicate an array of unknown type that may have pointers.


How do you have a void*[] point to a block of memory that is not a
multiple of (void*).sizeof?


Another good point. Or how do you index it by byte?


Re: Why are void[] contents marked as having pointers?

2009-06-03 Thread bearophile
Christopher Wright:
 Another good point. Or how do you index it by byte?

How can you read & write files of 3 bytes if voids are 4 bytes long chunks? :o) 
I don't understand. I want to read and write files byte-by-byte.

Bye,
bearophile


Re: Why are void[] contents marked as having pointers?

2009-06-03 Thread MLT
Walter Bright Wrote:

 Vladimir Panteleev wrote:
  I don't know why it was decided to mark the contents of void[] as
  "might have pointers". It makes no sense! Consider:
 
 [...]
 
  3) It's very rare in practice that the only pointer to your
  object (which you still plan to access later) is stored in a
  void[]-allocated array!
 
 Rare or common, it still would be a nasty bug lurking to catch someone. 
 The default behavior in D should be to be correct code. Doing 
 potentially unsafe things to improve performance should require extra 
 effort - in this case it would be either using the gc function to mark 
 the memory as not containing pointers, or storing them as ubyte[] instead.

As quite a newbie, I can sum up what I understood as follows:

1. The idea of void[] is that you can put anything in it without casting. 
2. Because of this, you might put pointers in a void[].
3. Since you have legitimately stored pointers, and we don't want to have the 
GC throw away something that we still have valid pointers for, we have to have 
the GC scan over void[] arrays for possible hits.

4. This pretty much means that any big(*) D program can not afford to put 
uniformly distributed data in a void[] array, because the GC will stop working 
correctly - it will not dispose of stuff that you don't need any more.
(*) where big means a program that creates and destroys a lot of objects.

So, currently if you want to use void[] to store non-pointers, you need to use 
the gc function to mark the memory as not containing pointers.

A comment and a question. I agree that suddenly losing data because you stored 
a pointer in a void[] is worse than GC not working well. However, since GC in D 
is so automatic, almost any use of void[] to store non-pointer data will cause 
massive memory leaks and eventual program failure. 

I can see 4 solutions...

First, to not allow non-pointers to be stored in void[]. So non-pointers are 
stored in ubyte[], pointers in void[]. Kinda loses the main point of using 
void[].

Second, void[] is not scanned by GC, but you can mark it to be. This can cause 
bugs if you store a pointer in void[], and later retrieve it, but don't mark 
correctly.

Third, void[] is scanned by GC,  but you can mark it not to be. This can cause 
memory leaks if you store complex data in void[] in a big program, and don't 
handle GC marking correctly.

Fourth - somewhat more complex. Since the compiler knows exactly when a pointer 
is stored in a void[] and when not, it would be possible to have the compiler 
handle all by itself, as long as the property of having to be scanned by GC is 
dirty - once a variable has it, any other that touches that variable gets the 
property.

Of these four solutions, the last 3 can still cause bugs if one stores both 
pointers and data in the same void[] array, no matter how the memory is marked, 
unless one does that marking on a very fine scale (is that possible?)

My conclusion from all this is either don't use void[], or only use void[] 
to store pointers if you don't want bugs in a valid program.




Re: Why are void[] contents marked as having pointers?

2009-06-03 Thread Christopher Wright

bearophile wrote:

Christopher Wright:

Another good point. Or how do you index it by byte?


How can you read & write files of 3 bytes if voids are 4 bytes long chunks? :o) 
I don't understand. I want to read and write files byte-by-byte.

Bye,
bearophile


Vladimir was suggesting that void[] be the same as ubyte[] and that you 
use void*[] if you might include a pointer. So that use case would be safe.


Re: Why are void[] contents marked as having pointers?

2009-06-03 Thread Christopher Wright

MLT wrote:

Walter Bright Wrote:


Vladimir Panteleev wrote:

I don't know why it was decided to mark the contents of void[] as
"might have pointers". It makes no sense! Consider:

[...]


3) It's very rare in practice that the only pointer to your
object (which you still plan to access later) is stored in a
void[]-allocated array!
Rare or common, it still would be a nasty bug lurking to catch someone. 
The default behavior in D should be to be correct code. Doing 
potentially unsafe things to improve performance should require extra 
effort - in this case it would be either using the gc function to mark 
the memory as not containing pointers, or storing them as ubyte[] instead.


As quite a newbie, I can sum up what I understood as follows:

1. The idea of void[] is that you can put anything in it without casting. 
2. Because of this, you might put pointers in a void[].

3. Since you have legitimately stored pointers, and we don't want to have the 
GC throw away something that we still have valid pointers for, we have to have the GC 
scan over void[] arrays for possible hits.

4. This pretty much means that any big(*) D program can not afford to put 
uniformly distributed data in a void[] array, because the GC will stop working correctly 
- it will not dispose of stuff that you don't need any more.
(*) where big means a program that creates and destroys a lot of objects.

So, currently if you want to use void[] to store non-pointers, you need to use 
the gc function to mark the memory as not containing pointers.

A comment and a question. I agree that suddenly losing data because you stored a pointer in a void[] is worse than GC not working well. However, since GC in D is so automatic, almost any use of void[] to store non-pointer data will cause massive memory leaks and eventual program failure. 


First, this is no problem if you are merely aliasing an existing array. 
In order for it to be an issue, you must copy from some array to a 
void[] -- for instance, appending to an existing void[], or .dup'ing a 
void[] alias. (While a GC could work around the latter case, it would be 
unsafe -- you can append something with pointers to a void[] copy of an 
int[].)
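
The aliasing case can be sketched in D2 as follows. This is only an illustration under the assumption that the runtime is a current druntime, where core.memory.GC.query reports a block's attributes; the helper name isNoScan is hypothetical.

```d
import core.memory : GC;

// Hypothetical helper: true if the GC block behind `block` is not scanned.
bool isNoScan(const(void)[] block)
{
    return (GC.query(block.ptr).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    int[] a = new int[](4);

    // Aliasing: the void[] views the same memory, nothing is allocated.
    void[] v = a;
    assert(v.ptr is a.ptr);
    assert(v.length == a.length * int.sizeof);

    // The block was allocated as int[], so viewing it as void[]
    // does not make the GC start scanning it.
    assert(isNoScan(v));
}
```

Only operations that allocate through the void[] type (appending, .dup) can produce a block that is scanned.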



I can see 4 solutions...

First, to not allow non-pointers to be stored in void[]. So non-pointers are 
stored in ubyte[], pointers in void[]. Kinda loses the main point of using 
void[].

Second, void[] is not scanned by GC, but you can mark it to be. This can cause 
bugs if you store a pointer in void[], and later retrieve it, but don't mark 
correctly.


This is an unsafe option.


Third, void[] is scanned by GC,  but you can mark it not to be. This can cause 
memory leaks if you store complex data in void[] in a big program, and don't 
handle GC marking correctly.


This is already available. If you know your array doesn't have pointers, 
you can call GC.hasNoPointers(array.ptr).
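
For readers on D2, a rough equivalent of that call can be sketched with core.memory. This is an assumption about how one would write it today, not the Tango API named above; the helper name markNoScan is hypothetical.

```d
import core.memory : GC;

// Hypothetical helper: tell the GC that the block behind `block`
// holds no pointers and need not be scanned.
void markNoScan(void[] block)
{
    // setAttr expects the base address of the block, so look it up first.
    GC.setAttr(GC.addrOf(block.ptr), GC.BlkAttr.NO_SCAN);
}

void main()
{
    void[] raw = new void[](64);
    markNoScan(raw);
    assert((GC.query(raw.ptr).attr & GC.BlkAttr.NO_SCAN) != 0);
}
```

The attribute belongs to the memory block, so a reallocation caused by a later append may need the call repeated.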


This is a safe option.


Fourth - somewhat more complex. Since the compiler knows exactly when a pointer 
is stored in a void[] and when not, it would be possible to have the compiler 
handle all by itself, as long as the property of having to be scanned by GC is 
dirty - once a variable has it, any other that touches that variable gets the 
property.


This isn't really the case unless you get some really invasive whole 
program analysis (not available with D's compilation model, or if you 
want to interact with code written in other languages, or if you want to 
do runtime dynamic linking) or a really invasive runtime (think of 
calling a method every time you access an array).


In point of fact, that's not going to be enough. You need to call the 
runtime with every assignment, since you might be passing individual 
ubytes around when they're part of a pointer and reassembling them 
somewhere else.



Of these four solutions, the last 3 can still cause bugs if one stores both 
pointers and data in the same void[] array, no matter how the memory is marked, 
unless one does that marking on a very fine scale (is that possible?)


struct S
{
    int i;
    int* j;
}

You're screwed.


My conclusion from all this is either don't use void[], or only use void[] to 
store pointers if you don't want bugs in a valid program.


Not bugs, but potential performance issues. And the advice should be 
"don't allocate void[]", to split hairs.


Re: Why are void[] contents marked as having pointers?

2009-06-02 Thread Vladimir Panteleev
On Tue, 02 Jun 2009 01:01:00 +0300, Christopher Wright dhase...@gmail.com 
wrote:

 Vladimir Panteleev wrote:
 On Mon, 01 Jun 2009 14:10:57 +0300, Christopher Wright  
 dhase...@gmail.com wrote:

 Vladimir Panteleev wrote:
 On Mon, 01 Jun 2009 05:28:39 +0300, Christopher Wright   
 dhase...@gmail.com wrote:

 Vladimir Panteleev wrote:
 std.boxer is actually a valid counter-example for my post.
 The specific fix is simple: replace the void[] with void*[].
 The generic fix is just to add a line to
 http://www.digitalmars.com/d/garbage.html adding that hiding your   
 only  reference in a void[] results in undefined behavior. I don't   
 think this  should be an inconvenience to any projects?
 What do you use for "may contain unaligned pointers"?
  Sorry, what do you mean? I don't understand why such a type is  
 needed?  Implementing support for scanning memory ranges for  
 unaligned pointers  will slow down the GC even more.
 Because you can have a struct with align(1) that contains pointers.  
 Then  these pointers can be unaligned. Then an array of those structs  
 cast to  a void*[] would contain pointers, but as an optimization, the  
 GC would  consider the pointers in this array aligned because you tell  
 it they are.
  The GC will not see unaligned pointers, regardless if they're in a  
 struct or void[] array. The GC doesn't know the type of the data it's  
 scanning - it just knows if it might contain pointers or it definitely  
 doesn't contain pointers.

 Okay, so currently the GC doesn't do anything interesting with its type  
 information. You're suggesting that that be enforced and codified.

I wasn't suggesting any GC modifications, I was just suggesting that void[]'s 
TypeInfo "has pointers" flag be set to false.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-02 Thread Christopher Wright

Vladimir Panteleev wrote:

I wasn't suggesting any GC modifications, I was just suggesting that void[]'s TypeInfo 
"has pointers" flag be set to false.


The suggestion was that void[] be used as ubyte[] currently is, and then 
to use void*[] to indicate an array of unknown type that may have pointers.


This works when all pointers are aligned, or when the garbage collector 
does not optimize in cases where a type is known not to contain 
unaligned pointers.


Alternatively, you can change the runtime to notify the GC on array 
copies so it can keep track of type information when you're avoiding the 
type system. But it's so easy to get around this by accident, it's not a 
reasonable solution (even if it could be made fast).


Re: Why are void[] contents marked as having pointers?

2009-06-02 Thread Jarrett Billingsley
On Tue, Jun 2, 2009 at 7:11 PM, Christopher Wright dhase...@gmail.com wrote:
 Vladimir Panteleev wrote:

 I wasn't suggesting any GC modifications, I was just suggesting that
 void[]'s TypeInfo "has pointers" flag be set to false.

 The suggestion was that void[] be used as ubyte[] currently is, and then to
 use void*[] to indicate an array of unknown type that may have pointers.

How do you have a void*[] point to a block of memory that is not a
multiple of (void*).sizeof?
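
The size mismatch can be illustrated with a small D2 sketch. The helper fitsPointerArray is a hypothetical name; the point is only that a void[] has byte granularity while a void*[] does not.

```d
// Hypothetical helper: can this block be viewed as void*[] without
// leaving bytes over?
bool fitsPointerArray(const(void)[] block)
{
    return block.length % (void*).sizeof == 0;
}

void main()
{
    ubyte[] file = [0xDE, 0xAD, 0xBE];   // a 3-byte payload

    // void[] describes it exactly and can be sliced per byte.
    void[] v = file;
    assert(v.length == 3);
    assert(v[1 .. $].length == 2);

    // No whole number of pointers covers 3 bytes.
    assert(!fitsPointerArray(v));
}
```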


Re: Why are void[] contents marked as having pointers?

2009-06-01 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 02:21:33 +0300, Andrei Alexandrescu 
seewebsiteforem...@erdani.org wrote:

 To argue that convincingly, you'd need to disable conversions from  
 arrays of class objects to void[].

You're right. Perhaps implicit cast of reference types to void[] should result 
in an error.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-01 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 05:28:39 +0300, Christopher Wright dhase...@gmail.com 
wrote:

 Vladimir Panteleev wrote:
 std.boxer is actually a valid counter-example for my post.
 The specific fix is simple: replace the void[] with void*[].
 The generic fix is just to add a line to  
 http://www.digitalmars.com/d/garbage.html adding that hiding your only  
 reference in a void[] results in undefined behavior. I don't think this  
 should be an inconvenience to any projects?

 What do you use for "may contain unaligned pointers"?

Sorry, what do you mean? I don't understand why such a type is needed? 
Implementing support for scanning memory ranges for unaligned pointers will 
slow down the GC even more.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-01 Thread Daniel Keep


Vladimir Panteleev wrote:
 On Mon, 01 Jun 2009 02:21:33 +0300, Andrei Alexandrescu 
 seewebsiteforem...@erdani.org wrote:
 
 To argue that convincingly, you'd need to disable conversions from  
 arrays of class objects to void[].
 
 You're right. Perhaps implicit cast of reference types to void[] should 
 result in an error.

If only there were a way to indicate that void[]s could contain
pointers, then they would behave uniformly across types...

Oh wait.


Re: Why are void[] contents marked as having pointers?

2009-06-01 Thread Vladimir Panteleev
On Sun, 31 May 2009 23:24:09 +0300, Andrei Alexandrescu 
seewebsiteforem...@erdani.org wrote:

 Another alternative would be to allow implicitly casting arrays of any  
 type to const(ubyte)[] which is always safe. But I think this is too  
 much ado about nothing - you're avoiding the type system to start with,  
 so use ubyte, insert a cast, and call it a day. If you have too many  
 casts, the problem is most likely elsewhere so that argument I'm not  
 buying.

I've thought about this for a bit. If we allow any *non-reference* type except 
void[] to implicitly cast to ubyte[], but still allow implicitly casting 
ubyte[] to void[], it will put ubyte[] in the perfect spot in the type 
hierarchy - it'll allow safely (portability issues notwithstanding) getting the 
representation of value-type (POD) arrays, while still allowing abstracting it 
even further to the "might have pointers" type - at which point it is unsafe to 
access individual bytes, which void[] disallows without casts.
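
Since the language does not do this implicitly, the idea can only be approximated with a helper. A minimal D2 sketch, assuming std.traits.hasIndirections is an acceptable stand-in for "non-reference type" and using const as in Andrei's version:

```d
import std.traits : hasIndirections;

// Only element types without pointers or references are accepted.
const(ubyte)[] getRepresentation(T)(const(T)[] data)
    if (!hasIndirections!T)
{
    return cast(const(ubyte)[]) data;
}

void main()
{
    int[] xs = [1, 2, 3];
    auto bytes = getRepresentation(xs);

    // Same memory, now viewed byte by byte.
    assert(bytes.length == xs.length * int.sizeof);
    assert(bytes.ptr is cast(const(ubyte)*) xs.ptr);

    // Arrays of pointers are rejected at compile time.
    static assert(!__traits(compiles, getRepresentation(new int*[](2))));
}
```

The result still converts implicitly to const(void)[], which matches the proposed position of ubyte[] in the hierarchy.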

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-01 Thread Christopher Wright

Vladimir Panteleev wrote:

On Mon, 01 Jun 2009 05:28:39 +0300, Christopher Wright dhase...@gmail.com 
wrote:


Vladimir Panteleev wrote:

std.boxer is actually a valid counter-example for my post.
The specific fix is simple: replace the void[] with void*[].
The generic fix is just to add a line to  
http://www.digitalmars.com/d/garbage.html adding that hiding your only  
reference in a void[] results in undefined behavior. I don't think this  
should be an inconvenience to any projects?

What do you use for "may contain unaligned pointers"?


Sorry, what do you mean? I don't understand why such a type is needed? 
Implementing support for scanning memory ranges for unaligned pointers will 
slow down the GC even more.


Because you can have a struct with align(1) that contains pointers. Then 
these pointers can be unaligned. Then an array of those structs cast to 
a void*[] would contain pointers, but as an optimization, the GC would 
consider the pointers in this array aligned because you tell it they are.
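
A small D2 sketch of such a struct, with hypothetical names, shows where the pointer ends up:

```d
struct Packed
{
align(1):
    ubyte tag;
    int* p;      // packed directly after the one-byte tag
}

// The pointer member sits at offset 1, which is not a multiple of
// the pointer size.
static assert(Packed.p.offsetof == 1);
static assert(Packed.p.offsetof % (void*).sizeof != 0);

void main() {}
```

A collector that only inspects pointer-aligned words would not recognise a reference stored in p.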


Re: Why are void[] contents marked as having pointers?

2009-06-01 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 02:18:46 +0300, Andrei Alexandrescu 
seewebsiteforem...@erdani.org wrote:

 Vladimir Panteleev wrote:
 On Mon, 01 Jun 2009 00:00:45 +0300, Andrei Alexandrescu  
 seewebsiteforem...@erdani.org wrote:

 const(ubyte)[] getRepresentation(T)(T[] data)
 {
  return cast(typeof(return)) data;
 }
  This is functionally equivalent to (forgive the D1):
 ubyte[] getRepresentation(void[] data)
 {
  return cast(ubyte[]) data;
 }
 Since no allocation is done in this case, the use of void[] is safe,  
 and it doesn't instantiate a version of the function for every type you  
 call it with. I remarked about this in my other reply.


Which is why I wrote "forgive the D1" :)
I've yet to switch to D2, but it's obvious that the const should be there to 
ensure safety.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-01 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 14:10:57 +0300, Christopher Wright dhase...@gmail.com 
wrote:

 Vladimir Panteleev wrote:
 On Mon, 01 Jun 2009 05:28:39 +0300, Christopher Wright  
 dhase...@gmail.com wrote:

 Vladimir Panteleev wrote:
 std.boxer is actually a valid counter-example for my post.
 The specific fix is simple: replace the void[] with void*[].
 The generic fix is just to add a line to   
 http://www.digitalmars.com/d/garbage.html adding that hiding your  
 only  reference in a void[] results in undefined behavior. I don't  
 think this  should be an inconvenience to any projects?
 What do you use for "may contain unaligned pointers"?
  Sorry, what do you mean? I don't understand why such a type is needed?  
 Implementing support for scanning memory ranges for unaligned pointers  
 will slow down the GC even more.

 Because you can have a struct with align(1) that contains pointers. Then  
 these pointers can be unaligned. Then an array of those structs cast to  
 a void*[] would contain pointers, but as an optimization, the GC would  
 consider the pointers in this array aligned because you tell it they are.

The GC will not see unaligned pointers, regardless if they're in a struct or 
void[] array. The GC doesn't know the type of the data it's scanning - it just 
knows if it might contain pointers or it definitely doesn't contain pointers.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-06-01 Thread Christopher Wright

Vladimir Panteleev wrote:

On Mon, 01 Jun 2009 14:10:57 +0300, Christopher Wright dhase...@gmail.com 
wrote:


Vladimir Panteleev wrote:
On Mon, 01 Jun 2009 05:28:39 +0300, Christopher Wright  
dhase...@gmail.com wrote:



Vladimir Panteleev wrote:

std.boxer is actually a valid counter-example for my post.
The specific fix is simple: replace the void[] with void*[].
The generic fix is just to add a line to   
http://www.digitalmars.com/d/garbage.html adding that hiding your  
only  reference in a void[] results in undefined behavior. I don't  
think this  should be an inconvenience to any projects?

What do you use for "may contain unaligned pointers"?
 Sorry, what do you mean? I don't understand why such a type is needed?  
Implementing support for scanning memory ranges for unaligned pointers  
will slow down the GC even more.
Because you can have a struct with align(1) that contains pointers. Then  
these pointers can be unaligned. Then an array of those structs cast to  
a void*[] would contain pointers, but as an optimization, the GC would  
consider the pointers in this array aligned because you tell it they are.


The GC will not see unaligned pointers, regardless if they're in a struct or 
void[] array. The GC doesn't know the type of the data it's scanning - it just knows if 
it might contain pointers or it definitely doesn't contain pointers.


Okay, so currently the GC doesn't do anything interesting with its type 
information. You're suggesting that that be enforced and codified.


Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
I just went through a ~15000-line project and replaced most occurrences of 
void[]. Now the project is an ugly mess of void[], ubyte[] and casts, but at 
least it doesn't leak memory like crazy any more.

I don't know why it was decided to mark the contents of void[] as "might have 
pointers". It makes no sense! Consider:

1) void[] has this wonderful, magical property that any array type implicitly 
casts to void[]. This makes it wonderful to use in libraries and functions that 
manipulate data with no regards to what it actually contains. Network 
libraries, compression libraries, etc. - right about anywhere where you'd use a 
void* and length in C++, a void[] is just as appropriate.
2) Despite that void[] is typeless, you can still operate on it - namely, 
slice and concatenate them. Pass a void[] to a network send() function - how 
much did you send? Half the buffer? No problem, slice it away and store the 
rest - and no casts.
3) It's very rare in practice that the only pointer to your object (which you 
still plan to access later) is stored in a void[]-allocated array! Remember, 
the properties of memory regions are determined when the memory is allocated, 
so casting an array of structures to a void[] will not lose you that reference. 
You'd need to move your pointer to a void[]-array (which you need to allocate 
explicitly or, for example, concatenating your reference to the void[]), then 
drop the reference to your original structure, for this to happen.

Here's a simple naive implementation of a buffer:

void[] buffer;
void queue(void[] data)
{
    buffer ~= data;
}
...
queue([1,2,3][]);
queue("Hello, World!");

No casts! So simple and beautiful. However, should you use this pattern to work 
with larger amounts of data with a high entropy, the minefield effect will 
cause the GC to stop collecting most data. Sure, you can call 
std.gc.hasNoPointers, but you need to do it after every single concatenation... 
and it makes expressions with more than one concatenation unsafe.
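
For comparison, the ubyte[] variant of the same buffer can be sketched in D2 as below. It is an illustration only: it needs one cast inside queue, and the check through core.memory.GC.query assumes a current druntime.

```d
import core.memory : GC;

ubyte[] buffer;

// Accept any array, but store the bytes in a pointer-free buffer.
void queue(const(void)[] data)
{
    buffer ~= cast(const(ubyte)[]) data;
}

void main()
{
    int[] nums = [1, 2, 3];
    queue(nums);
    queue("Hello, World!");

    assert(buffer.length == 3 * int.sizeof + 13);

    // Blocks allocated for ubyte[] are not scanned, so no marking
    // is needed after each concatenation.
    assert((GC.query(buffer.ptr).attr & GC.BlkAttr.NO_SCAN) != 0);
}
```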

I heard that Tango copies over the properties of arrays when they are 
reallocated, which helps but solves the problem only partially.

So, I ask you: is there actually code out there that depends on the way void[] 
works right now? I brought up this argument a year or so ago on IRC, and there 
were people who defended ferociously the current design using idealisms ("it 
should work like what it sounds like", "it should contain any type" or something 
like that), but I've yet to see a practical argument.


P.S. How come the standard library doesn't have a simple function like this?

T[] toArray(T)(inout T data) { return (&data)[0..1]; }

It happens often that I need to get a slice of memory around an object's 
reference (for example to pass it to a function that takes a void[] :D), and 
typing (&x)[0..1] every time feels like a hack.
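
In D2 the same helper would take its argument by ref, since inout has a different meaning there. A minimal sketch:

```d
// Returns a one-element slice over the variable itself (no copy).
T[] toArray(T)(ref T data)
{
    return (&data)[0 .. 1];
}

void main()
{
    int x = 42;
    int[] s = toArray(x);

    assert(s.length == 1);
    assert(s.ptr is &x);

    // Writing through the slice changes the original variable.
    s[0] = 7;
    assert(x == 7);
}
```

The slice is only valid for as long as the variable it points at.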

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Walter Bright

Vladimir Panteleev wrote:

I don't know why it was decided to mark the contents of void[] as
"might have pointers". It makes no sense! Consider:


[...]


3) It's very rare in practice that the only pointer to your
object (which you still plan to access later) is stored in a
void[]-allocated array!


Rare or common, it still would be a nasty bug lurking to catch someone. 
The default behavior in D should be to be correct code. Doing 
potentially unsafe things to improve performance should require extra 
effort - in this case it would be either using the gc function to mark 
the memory as not containing pointers, or storing them as ubyte[] instead.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread BCS

Hello grauzone,


You shouldn't cast structs or any other types to ubyte[], because the
memory representation of those types is highly platform specific.
Structs can contain padding, integers are endian-dependent... If you
want to convert these to binary data, write a marshaller. You _never_
want to do direct casts, because they're simply unportable. If you do
the cast, you have to know what you're doing.



Never say never. Some cases, like tmp files or whatnot where the same exe 
will save and load the file, never* have any need for portability.


*never used intentionally :b.




Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread BCS

Hello Vladimir,


I just went through a ~15000-line project and replaced most
occurrences of void[]. Now the project is an ugly mess of void[],
ubyte[] and casts, but at least it doesn't leak memory like crazy any
more.

I don't know why it was decided to mark the contents of void[] as
"might have pointers". It makes no sense! Consider:

2) Despite that void[] is typeless, you can still operate on it -
namely, slice and concatenate them. Pass a void[] to a network send()
function - how much did you send? Half the buffer? No problem, slice
it away and store the rest - and no casts.

3) It's very rare in practice that the only pointer to your object
(which you still plan to access later) to be stored in a
void[]-allocated array! Remember, the properties of memory regions are
determined when the memory is allocated, so casting an array of
structures to a void[] will not lose you that reference. You'd need to
move your pointer to a void[]-array (which you need to allocate
explicitly or, for example, concatenating your reference to the
void[]), then drop the reference to your original structure, for this
to happen.



I think the idea is that void[] is the most general data type; it can be 
anything, including pointers. 

Also, for a real-world use case where void[]=mightHavePointers is valid, consider 
a system that reads blocks of data structures from a file and then does 
in-place substitution of file references with memory references. You can't allocate 
buffers of the correct type because you may not even know what that is until 
you have already loaded the data.




Here's a simple naive implementation of a buffer:

void[] buffer;
void queue(void[] data)
{
    buffer ~= data;
}
...
queue([1,2,3][]);
queue("Hello, World!");
No casts! So simple and beautiful. However, should you use this
pattern to work with larger amounts of data with a high entropy, the
minefield effect will cause the GC to stop collecting most data.
Sure, you can call std.gc.hasNoPointers, but you need to do it after
every single concatenation... and it makes expressions with more than
one concatenation unsafe.


Yes, when data is being copied into void[] from another type[] it is reasonable 
to ignore pointers but as above, going the other way (IMHO the /common/ case) 
it's not so easy.




I heard that Tango copies over the properties of arrays when they are
reallocated, which helps but solves the problem only partially.

So, I ask you: is there actually code out there that depends on the
way void[] works right now? I brought up this argument a year or so
ago on IRC, and there were people who defended ferociously the current
design using idealisms (it should work like what it sounds like, it
should contain any type or something like that), but I've yet to see
a practical argument.


I think that void[] should be left as is, but I'm almost ready to throw in 
with the idea that we **need** another type that has the no-cast parts of 
void[] but assumes no pointers as well.





Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Denis Koroskin
On Mon, 01 Jun 2009 00:53:02 +0400, BCS n...@anon.com wrote:

 Hello Vladimir,

 I just went through a ~15000-line project and replaced most
 occurrences of void[]. Now the project is an ugly mess of void[],
 ubyte[] and casts, but at least it doesn't leak memory like crazy any
 more.
  I don't know why it was decided to mark the contents of void[] as
 might have pointers. It makes no sense! Consider:
  2) Despite that void[] is typeless, you can still operate on it -
 namely, slice and concatenate them. Pass a void[] to a network send()
 function - how much did you send? Half the buffer? No problem, slice
 it away and store the rest - and no casts.
  3) It's very rare in practice that the only pointer to your object
 (which you still plan to access later) to be stored in a
 void[]-allocated array! Remember, the properties of memory regions are
 determined when the memory is allocated, so casting an array of
 structures to a void[] will not lose you that reference. You'd need to
 move your pointer to a void[]-array (which you need to allocate
 explicitly or, for example, concatenating your reference to the
 void[]), then drop the reference to your original structure, for this
 to happen.


 I think the idea is that void[] is the most general data type; it can be  
 anything, including pointers.  
 Also, for a real-world use case where void[]=mightHavePointers is valid,  
 consider a system that reads blocks of data structures from a file and  
 then does in-place substitution of file references with memory references.  
 You can't allocate buffers of the correct type because you may not even  
 know what that is until you have already loaded the data.


In this case you should *explicitly* mark that void[] array as 
mightHavePointers.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

BCS wrote:

so use ubyte, insert a cast, and call it a day. If you have too
many casts, the problem is most likely elsewhere


You might be correct, but I don't think any of us have enough info right 
now to make that assertion.


Oh there is enough information. What's needed is:

const(ubyte)[] getRepresentation(T)(T[] data)
{
return cast(typeof(return)) data;
}

If you have many calls to getRepresentation(), then that 
anticlimactically shows that you need to look at arrays' representations 
often. If there are too many of those, maybe some of the said arrays 
should be dealt with as ubyte[] in the first place.
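
A short usage sketch of the function above, to show what a call site looks like. The Header struct is a made-up payload type used only for this example.

```d
import std.stdio : writeln;

/// Views any array's memory as read-only bytes. Nothing is allocated,
/// so the GC attributes of the underlying block stay as they were.
const(ubyte)[] getRepresentation(T)(T[] data)
{
    return cast(typeof(return)) data;
}

struct Header { uint magic; ushort ver; ushort flags; }  // hypothetical

void main()
{
    auto ints = [1, 2, 3];
    auto bytes = getRepresentation(ints);
    assert(bytes.length == ints.length * int.sizeof);

    Header[] hs = [Header(0xCAFEBABE, 1, 0)];
    assert(getRepresentation(hs).length == Header.sizeof);

    // bytes[0] = 0;   // would not compile: the view is const
    writeln(bytes.length);
}
```

Each distinct element type instantiates its own copy of the template, which is the cost of keeping the element type visible at the call site.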



Andrei


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Sun, 31 May 2009 23:11:57 +0300, grauzone n...@example.net wrote:

 3) It's very rare in practice that the only pointer to your object  
 (which you still plan to access later) to be stored in a  
 void[]-allocated array! Remember, the properties of memory regions are  
 determined when the memory is allocated, so casting an array of  
 structures to a void[] will not lose you that reference. You'd need to  
 move your pointer to a void[]-array (which you need to allocate  
 explicitly or, for example, concatenating your reference to the  
 void[]), then drop the reference to your original structure, for this  
 to happen.

 void[] = can contain pointers
 ubyte[] = can not contain pointers

 void[] just wraps void*, which is a low level type and can contain  
 anything. Because of that, the conservative GC needs to scan it for  
 pointers. ubyte[], on the other hand, contains sequences of 8 bit  
 integers. For untyped binary data, ubyte[] is the most correct type.

 You want to send it over network or write it into a file? Use ubyte[].  
 The data will never contain any pointers. You want to play low level  
 tricks, that involve copying around arbitrary memory contents (like  
 boxing, see std.boxer)? Use void[].

std.boxer is actually a valid counter-example for my post.
The specific fix is simple: replace the void[] with void*[].
The generic fix is just to add a line to 
http://www.digitalmars.com/d/garbage.html adding that hiding your only 
reference in a void[] results in undefined behavior. I don't think this should 
be an inconvenience to any projects?

 You shouldn't cast structs or any other types to ubyte[], because the  
 memory representation of those types is highly platform specific. Structs  
 can contain padding, integers are endian-dependent... If you want to  
 convert these to binary data, write a marshaller. You _never_ want to do  
 direct casts, because they're simply unportable. If you do the cast, you  
 have to know what you're doing.

Thanks for the advice, but I actually know what I'm doing. Unlike C, D's 
structure alignment rules are actually part of the specification. If I wanted 
my programs to be safe/cross-platform/etc. regardless of execution speed, I'd 
use a scripting or VM-ed language.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread BCS

Hello Andrei,


BCS wrote:


so use ubyte, insert a cast, and call it a day. If you have too many
casts, the problem is most likely elsewhere


You might be correct, but I don't think any of us have enough info
right now to make that assertion.


Oh there is enough information. What's needed is:

const(ubyte)[] getRepresentation(T)(T[] data)
{
return cast(typeof(return)) data;
}
If you have many calls to getRepresentation(), then that
anticlimactically shows that you need to look at arrays'
representations often. If there are too many of those, maybe some of
the said arrays should be dealt with as ubyte[] in the first place.


Maybe in some cases, but if the primary function of the code is processing 
stuff between raw data and other data types then the above is irrelevant. 
The OP sort of hinted somewhere that this is the kind of thing he is working 
on. Without knowing what the OP is doing, I still don't think we can say 
if his program is well designed.





Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Sun, 31 May 2009 23:24:09 +0300, Andrei Alexandrescu 
seewebsiteforem...@erdani.org wrote:

 But I think this is too much ado about nothing - you're avoiding the type 
 system to start with, so use ubyte, insert a cast, and call it a day. 

I don't get it - not using casts is avoiding the type system? :P Note that I am 
NOT up-casting the void[] later back to some other type - it goes out to the 
network, a file, etc. void[] sounds like it fits perfectly in the type 
hierarchy for "just a bunch of bytes", except for the "may contain pointers" 
fine print.

 If you have too many casts, the problem is most likely elsewhere so that 
 argument I'm not buying.

I could cut down on the number of casts if I were to replace most array 
appending operations with calls to a function that takes a void[] and then 
internally casts to a ubyte[] and appends that somewhere. There's a lot of 
diversity of types being worked with in my case - strings, various structs, 
more raw data, etc. I'm more annoyed that I'd need to do something like that to 
work around a design decision that may not have been fully thought out.
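
A rough sketch of the kind of wrapper described in the paragraph above. The name ByteQueue is made up for the example: call sites stay cast-free, while the bytes end up in a ubyte[] that the GC does not scan.

```d
/// Hypothetical byte sink: accepts any array without a cast at the call
/// site, but stores the bytes in a ubyte[] buffer.
struct ByteQueue
{
    ubyte[] buffer;

    void put(const(void)[] data)
    {
        // The single cast lives here instead of at every call site.
        buffer ~= cast(const(ubyte)[]) data;
    }
}

void main()
{
    ByteQueue q;
    int[] nums = [1, 2, 3];
    q.put(nums);                // int[] converts implicitly to const(void)[]
    q.put("Hello, World!");     // so does a string
    assert(q.buffer.length == 3 * int.sizeof + 13);
}
```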

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Walter Bright

Vladimir Panteleev wrote:

I just realized that by "performance" you might have meant memory
leaks.


No, in this context I meant improving performance by not scanning the 
void[] memory for pointers.



Well, sure, if you can say that my programs crashing every few
hours due to running out of memory is a performance problem. I'm
sorry to sound bitter, but this was the cause of much annoyance for
my software's users. It took me to write a memory debugger to
understand that no matter how much you chase void[]s with
hasNoPointers, there will always be that one ~ which you overlooked.


I'm curious what form of data you have that always seems to look like 
valid pointers. There are a couple other options you can pursue - moving 
the gc pool to another location in the address space, or changing the 
alignment of your void[] data so it won't look like aligned pointers 
(the gc won't look for misaligned pointers).


Or just use ubyte[] instead.


As much as I try to look from an objective perspective, I don't see
how a memory leak (and memory leaks in D usually mean that NO memory
is being freed, except for small lucky objects not having bogus
pointers to them) is a problem less significant than an obscure case
that involves allocating a void[], storing a pointer in it and losing
all other references to the object.


Because one is an obvious failure, and the other will be memory 
corruption. Memory corruption is pernicious and awful.



In fact, I just searched the D
documentation and I couldn't find a statement saying whether void[]
are scanned by the GC or not. Enter mr. D-newbie, who wants to write
his own network/compression/file-copying/etc. library/program and
stumbles upon void[], the seemingly perfect
abstract-binary-data-container type for the job... (which is exactly
what happened with yours truly).

P.S. Not trying to push my point of view, but just trying to offer
some perspective from someone who has been bit by this design
choice...


Hmm. Wouldn't compression data be naturally a ubyte[] type?



Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 00:00:45 +0300, Andrei Alexandrescu 
seewebsiteforem...@erdani.org wrote:

 const(ubyte)[] getRepresentation(T)(T[] data)
 {
  return cast(typeof(return)) data;
 }

This is functionally equivalent to (forgive the D1):
ubyte[] getRepresentation(void[] data)
{
return cast(ubyte[]) data;
}
Since no allocation is done in this case, the use of void[] is safe, and it 
doesn't instantiate a version of the function for every type you call it with. 
I remarked about this in my other reply.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread BCS

Hello Walter,


I'm curious what form of data you have that always seem to look like
valid pointers. There are a couple other options you can pursue -
moving the gc pool to another location in the address space, or
changing the alignment of your void[] data so it won't look like
aligned pointers (the gc won't look for misaligned pointers).



Most (but not all) of the cases I can think of where you get false pointers, 
re-aligning stuff or moving the heap won't help as the false pointer source 
will hit the full address space.





Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 00:28:21 +0300, Walter Bright newshou...@digitalmars.com 
wrote:

 Vladimir Panteleev wrote:
 I just realized that by performance you might have meant memory
 leaks.

 No, in this context I meant improving performance by not scanning the  
 void[] memory for pointers.

 Well, sure, if you can say that my programs crashing every few
 hours due to running out of memory is a performance problem. I'm
 sorry to sound bitter, but this was the cause of much annoyance for
 my software's users. It took me to write a memory debugger to
 understand that no matter how much you chase void[]s with
 hasNoPointers, there will always be that one ~ which you overlooked.

 I'm curious what form of data you have that always seem to look like  
 valid pointers. There are a couple other options you can pursue - moving  
 the gc pool to another location in the address space, or changing the  
 alignment of your void[] data so it won't look like aligned pointers  
 (the gc won't look for misaligned pointers).

It's just compressed data, which is evenly distributed across the 32-bit 
address space. Let's do the math:

Suppose we have an application which has two blocks of memory, M and N. Block M 
is a block with random data which is erroneously marked as having pointers, 
while block N is a block which shouldn't have any pointers towards it.
Now, the chance that a random DWORD will point inside N is 
sizeof(N)/0x100000000 - or rather, we can say that it will NOT point inside N 
with the probability of 1-(sizeof(N)/0x100000000). For as many DWORDs as there 
are in M, raise that to the power sizeof(M)/4. For values already as small as 1 
MB for M and N, it's pretty much guaranteed that you'll have pointers inside N. 
Relocating or re-aligning the data won't help - it won't affect the entropy or 
the value range.
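
Worked through with numbers, under the assumptions that the address space is 2^32 bytes and that each aligned 4-byte word of M is an independent, uniformly random value:

```latex
% |M|, |N| are block sizes in bytes; M holds |M|/4 aligned 32-bit words.
\[
P(\text{no word of } M \text{ points into } N)
  = \left(1 - \frac{|N|}{2^{32}}\right)^{|M|/4}
\]
% For |M| = |N| = 1 MB = 2^{20} bytes:
\[
\left(1 - 2^{-12}\right)^{2^{18}}
  \approx e^{-2^{18}/2^{12}} = e^{-64} \approx 1.6 \times 10^{-28}
\]
```

So with two 1 MB blocks the chance that N escapes every false pointer is about 1.6e-28, which is effectively zero.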

 Or just use ubyte[] instead.

And the casts that come with it :(

 As much as I try to look from an objective perspective, I don't see
 how a memory leak (and memory leaks in D usually mean that NO memory
 is being freed, except for small lucky objects not having bogus
 pointers to them) is a problem less significant than an obscure case
 that involves allocating a void[], storing a pointer in it and losing
 all other references to the object.

 Because one is an obvious failure, and the other will be memory  
 corruption. Memory corruption is pernicious and awful.

It is, yes. But if you add "don't put your only references inside void[]s" to 
the "don'ts" on the GC page, the programmer will only have himself to blame for 
not reading the language documentation. This goes right along with other 
tricks IMHO.

 In fact, I just searched the D
 documentation and I couldn't find a statement saying whether void[]
 are scanned by the GC or not. Enter mr. D-newbie, who wants to write
 his own network/compression/file-copying/etc. library/program and
 stumbles upon void[], the seemingly perfect
 abstract-binary-data-container type for the job... (which is exactly
 what happened with yours truly).
  P.S. Not trying to push my point of view, but just trying to offer
 some perspective from someone who has been bit by this design
 choice...

 Hmm. Wouldn't compression data be naturally a ubyte[] type?

That's a subjective opinion :) I could just as well continue arguing that 
void[] is the perfect type for any kind of opaque binary data due to its 
properties.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 00:28:21 +0300, Walter Bright newshou...@digitalmars.com 
wrote:

 Because one is an obvious failure, and the other will be memory  
 corruption. Memory corruption is pernicious and awful.

I wanted to add that debugging memory corruptions and other memory problems for 
D right now is complicated due to lack of proper tools in this area. Hopefully 
this will change in the near future.

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Vladimir Panteleev
On Mon, 01 Jun 2009 00:28:21 +0300, Walter Bright newshou...@digitalmars.com 
wrote:

 Hmm. Wouldn't compression data be naturally a ubyte[] type?

(again, something I forgot to add... shouldn't hit Send so soon)

Consider this really basic example of file concatenation:

auto data = read("file1") ~ read("file2"); // oops! void[] concatenation - minefield created

-- 
Best regards,
 Vladimir  mailto:thecybersha...@gmail.com


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread bearophile
Vladimir Panteleev:
 Consider this really basic example of file concatenation:
 auto data = read("file1") ~ read("file2"); // oops! void[] concatenation - minefield created

I think a better design for that read() function is to return ubyte[].
I have never understood why it returns a void[].
To manage generic data in your program, ubyte[] is better than void[] (sometimes 
uint[] is useful to increase efficiency compared to ubyte[]).

Bye,
bearophile


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

Vladimir Panteleev wrote:

On Sun, 31 May 2009 23:24:09 +0300, Andrei Alexandrescu 
seewebsiteforem...@erdani.org wrote:

But I think this is too much ado about nothing - you're avoiding the type system to start with, so use ubyte, insert a cast, and call it a day. 


I don't get it - not using casts is avoiding the type system? :P Note that I am NOT up-casting the 
void[] later back to some other type - it goes out to the network, a file, etc. void[] sounds like 
it fits perfectly in the type hierarchy for just a bunch of bytes, except for the 
may contain pointers fine print.


If you have too many casts, the problem is most likely elsewhere so that 
argument I'm not buying.


I could cut down on the number of casts if I were to replace most array 
appending operations to calls to a function that takes a void[] and then 
internally casts to an ubyte[] and appends that somewhere. There's a lot of 
diversity of types being worked with in my case - strings, various structs, 
more raw data, etc. I'm more annoyed that I'd need to do something like that to 
work around a design decision that may not have been fully thought out.



Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

Vladimir Panteleev wrote:

On Mon, 01 Jun 2009 00:00:45 +0300, Andrei Alexandrescu 
seewebsiteforem...@erdani.org wrote:


const(ubyte)[] getRepresentation(T)(T[] data)
{
 return cast(typeof(return)) data;
}


This is functionally equivalent to (forgive the D1):
ubyte[] getRepresentation(void[] data)
{
return cast(ubyte[]) data;
}
Since no allocation is done in this case, the use of void[] is safe, and it 
doesn't instantiate a version of the function for every type you call it with. 
I remarked about this in my other reply.



This is not safe because you can change the data.
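
One possible way to reconcile the two versions, sketched here as an assumption rather than something either poster proposed: keep the non-template signature but take and return const data, so the byte view cannot be used to modify the original array.

```d
/// Non-template variant: any array converts implicitly to const(void)[],
/// and the const result prevents writes through the byte view.
const(ubyte)[] getRepresentation(const(void)[] data)
{
    return cast(const(ubyte)[]) data;
}

void main()
{
    int[] values = [1, 2];
    auto bytes = getRepresentation(values);
    assert(bytes.length == values.length * int.sizeof);

    // bytes[0] = 0xFF;   // would not compile: cannot modify const data
}
```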

Andrei


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

Vladimir Panteleev wrote:

On Sun, 31 May 2009 23:24:09 +0300, Andrei Alexandrescu
seewebsiteforem...@erdani.org wrote:


But I think this is too much ado about nothing - you're avoiding
the type system to start with, so use ubyte, insert a cast, and
call it a day.


I don't get it - not using casts is avoiding the type system? :P Note
that I am NOT up-casting the void[] later back to some other type -
it goes out to the network, a file, etc. void[] sounds like it fits
perfectly in the type hierarchy for just a bunch of bytes, except
for the may contain pointers fine print.


I understand. You are sending around object representation. void[] may 
contain pointers, so you're simply not looking at the right abstraction.



If you have too many casts, the problem is most likely elsewhere so
that argument I'm not buying.


I could cut down on the number of casts if I were to replace most
array appending operations to calls to a function that takes a void[]
and then internally casts to an ubyte[] and appends that somewhere.
There's a lot of diversity of types being worked with in my case -
strings, various structs, more raw data, etc. I'm more annoyed that
I'd need to do something like that to work around a design decision
that may not have been fully thought out.


Walter has written a class called OutBuffer (see std.outbuffer) the 
likes of which could be used to encapsulate representation marshaling.


Andrei



Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Lionello Lunesu

Denis Koroskin wrote:

On Sun, 31 May 2009 22:45:23 +0400, Vladimir Panteleev 
thecybersha...@gmail.com wrote:
 
I just went through a ~15000-line project and replaced most occurrences  
of void[]. Now the project is an ugly mess of void[], ubyte[] and casts,  
but at least it doesn't leak memory like crazy any more.
 
I don't know why it was decided to mark the contents of void[] as might  
have pointers. It makes no sense!
 
 
FWIW, I also consider void[] as a storage for an arbitrary untyped binary
data, and thus I believe GC shouldn't scan it.

You're contradicting yourself there. void[] is arbitrary untyped data, 
so it could contain uints, floats, bytes, pointers, arrays, strings, 
etc. or structs with any of those.


I think the current behavior is correct: ubyte[] is the new void*.

I also agree that std.file.read (and similar functions) should return 
ubyte[] instead of void[], to prevent surprises after concatenation.


L.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Christopher Wright

Lionello Lunesu wrote:

Denis Koroskin wrote:
On Sun, 31 May 2009 22:45:23 +0400, Vladimir Panteleev 
thecybersha...@gmail.com wrote:
 
I just went through a ~15000-line project and replaced most 
occurrences  of void[]. Now the project is an ugly mess of void[], 
ubyte[] and casts,  but at least it doesn't leak memory like crazy 
any more.
 
I don't know why it was decided to mark the contents of void[] as 
might  have pointers. It makes no sense!
 
 
FWIW, I also consider void[] as a storage for an arbitrary untyped binary
data, and thus I believe GC shouldn't scan it.

You're contradicting yourself there. void[] is arbitrary untyped data, 
so it could contain uints, floats, bytes, pointers, arrays, strings, 
etc. or structs with any of those.


I think the current behavior is correct: ubyte[] is the new void*.


Even in C, people often use unsigned char* for arbitrary data that does 
not include pointers.


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Christopher Wright

Vladimir Panteleev wrote:

std.boxer is actually a valid counter-example for my post.
The specific fix is simple: replace the void[] with void*[].
The generic fix is just to add a line to 
http://www.digitalmars.com/d/garbage.html adding that hiding your only reference in a 
void[] results in undefined behavior. I don't think this should be an inconvenience to 
any projects?


What do you use for "may contain unaligned pointers"?


Re: Why are void[] contents marked as having pointers?

2009-05-31 Thread Andrei Alexandrescu

Vladimir Panteleev wrote:

That's a subjective opinion :) I could just as well continue arguing
that void[] is the perfect type for any kind of opaque binary data
due to its properties.


To argue that convincingly, you'd need to disable conversions from 
arrays of class objects to void[].


Andrei