Re: Dealing with the interior pointers bug

2017-06-23 Thread Russel Winder via Digitalmars-d-learn
On Thu, 2017-06-22 at 18:38 +0000, Boris-Barboris via Digitalmars-d-learn wrote:
> […]
> 
> Casts are part of the type system. Yes, D's type system allows
> invalid operations. It's not the compiler's fault, it's the type
> system's fault.
> […]

Well, maybe casts should not be allowed, as they effectively break the
type system.

Sadly, D2 has casts, so the type system is weak, and so problems with GC
algorithms are allowed.

Maybe it is time for D3, which is D2 and no casts.

-- 
Russel.
=
Dr Russel Winder  t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Roadm: +44 7770 465 077   xmpp: rus...@winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder



Re: Dealing with the interior pointers bug

2017-06-22 Thread ag0aep6g via Digitalmars-d-learn

On 06/22/2017 08:38 PM, Boris-Barboris wrote:
> Casts are part of the type system. Yes, D's type system allows invalid
> operations. It's not the compiler's fault, it's the type system's fault.
>
> unittest
> {
>     immutable int a = 4;
>     int* b = cast(int*) &a;
>     *b = 5;
>     assert(*(&a) == 5);
>     assert(a == 4);
> }


This is just arguing semantics, of course, but I wouldn't say that the 
type system allows this specific invalid operation. Rather, with casting 
you can step out of the type system, and break its guarantees.


Point is, you need a way to say that the operation is invalid. It's 
invalid because it breaks what `immutable` promises. `immutable` is part 
of the type, so I'd say the guarantee is part of the type system.


Re: Dealing with the interior pointers bug

2017-06-22 Thread Boris-Barboris via Digitalmars-d-learn

On Thursday, 22 June 2017 at 19:11:19 UTC, Cym13 wrote:
> Here it's the programmer's fault really. You should never use
> casts in normal code; cast is the ultimate switch to say "Look,
> I know what I'm doing, so disable all safety, don't try to make
> sense of it, and let me do my thing. If I'm telling you it's a
> cat, then it is, dammit." You can't blame the type system for
> not doing something coherent here; you explicitly went out of
> your way to lie to that very same type system in the most unsafe
> way possible.

We're on the same page, I just think that the ability to lie is part
of the type system, that's all.


Re: Dealing with the interior pointers bug

2017-06-22 Thread Cym13 via Digitalmars-d-learn

On Thursday, 22 June 2017 at 18:38:59 UTC, Boris-Barboris wrote:
> On Thursday, 22 June 2017 at 13:56:29 UTC, ag0aep6g wrote:
> > For example, the type system guarantees that immutable data
> > never changes. But the compiler allows you to cast from
> > immutable to mutable and change the data. It's an invalid
> > operation, but the compiler is not expected to catch that for
> > you.
>
> Casts are part of the type system. Yes, D's type system allows
> invalid operations. It's not the compiler's fault, it's the type
> system's fault.
>
> unittest
> {
>     immutable int a = 4;
>     int* b = cast(int*) &a;
>     *b = 5;
>     assert(*(&a) == 5);
>     assert(a == 4);
> }

Here it's the programmer's fault really. You should never use
casts in normal code; cast is the ultimate switch to say "Look, I
know what I'm doing, so disable all safety, don't try to make
sense of it, and let me do my thing. If I'm telling you it's a
cat, then it is, dammit." You can't blame the type system for not
doing something coherent here; you explicitly went out of your way
to lie to that very same type system in the most unsafe way
possible.


Re: Dealing with the interior pointers bug

2017-06-22 Thread Boris-Barboris via Digitalmars-d-learn

On Thursday, 22 June 2017 at 13:56:29 UTC, ag0aep6g wrote:
> For example, the type system guarantees that immutable data
> never changes. But the compiler allows you to cast from
> immutable to mutable and change the data. It's an invalid
> operation, but the compiler is not expected to catch that for
> you.

Casts are part of the type system. Yes, D's type system allows
invalid operations. It's not the compiler's fault, it's the type
system's fault.


unittest
{
    immutable int a = 4;
    // Casting away immutable compiles in @system code.
    int* b = cast(int*) &a;
    // Writing to immutable data through the pointer is undefined behavior.
    *b = 5;
    assert(*(&a) == 5);
    assert(a == 4);
}


Re: Dealing with the interior pointers bug

2017-06-22 Thread Steven Schveighoffer via Digitalmars-d-learn

On 6/21/17 1:23 PM, H. S. Teoh via Digitalmars-d-learn wrote:
> On Wed, Jun 21, 2017 at 05:11:41PM +0000, TheGag96 via Digitalmars-d-learn wrote:
> > On Wednesday, 21 June 2017 at 15:42:22 UTC, Adam D. Ruppe wrote:
> > > This comes from the fact that D's GC is conservative - if it sees
> > > something that *might* be a pointer, it assumes it *is* a pointer
> > > and thus had better not get freed.
> >
> > So is the GC then simply made to be "better-safe-than-sorry" or is
> > this a consequence of how the GC does things? Or rather, does the GC
> > know the type of any references to its memory at all?
>
> The reason the GC must be conservative is because (1) D is a systems
> programming language, and also because (2) D interfaces directly with C.


There are actually two categories of reasons: Design and Implementation.

In the Design category, D can never have a truly precise scanning
capability because of void * and unions. With these two features, it is
impossible to determine what the actual layout of pointers in a given
block is.
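
A minimal sketch of the union case, assuming a made-up type name `PtrOrBits` (it is not from druntime or Phobos). The same bytes can be a live pointer or an ordinary integer, and nothing in the block itself records which view is current:

```d
// Hypothetical type, for illustration only: both members share the same
// bytes, so the layout alone cannot say whether the block holds a pointer.
union PtrOrBits
{
    int*   ptr;   // a real reference into GC memory
    size_t bits;  // the very same bytes viewed as a plain integer
}

unittest
{
    PtrOrBits u;
    u.ptr = new int;
    *u.ptr = 42;

    // The integer view holds exactly the address stored through `ptr`.
    assert(u.bits == cast(size_t) u.ptr);
    assert(*u.ptr == 42);
}
```

A scanner that wants to be safe has to treat such a field as a possible pointer whenever the stored value looks like an address.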


In the Implementation category, precise scanning (to a certain degree) 
is achievable. But the current GC treats all blocks as "every 
size_t.sizeof bytes are a pointer" or "there are no pointers". With a 
better understanding of the layout of memory, the GC could be smarter 
about scanning. However, there are costs to the complexity.
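
To make the "every size_t.sizeof bytes are a pointer" idea concrete, here is a toy model. The function name `mightReference` and its parameters are assumptions made for this sketch; druntime's real scanner is organized differently and is considerably more involved.

```d
// Toy model of conservative scanning; NOT druntime's actual implementation.
// Every size_t-sized word is treated as a potential pointer into the block
// [blockStart, blockStart + blockLen).
bool mightReference(const(size_t)[] words, size_t blockStart, size_t blockLen)
{
    foreach (w; words)
    {
        // An interior pointer (anywhere inside the block) also counts.
        if (w >= blockStart && w - blockStart < blockLen)
            return true;
    }
    return false;
}

unittest
{
    auto block = new ubyte[](64);
    immutable start = cast(size_t) block.ptr;

    // The middle word is just an integer, but it looks like an interior pointer.
    size_t[3] scanned = [0, start + 10, 1];
    assert(mightReference(scanned[], start, block.length));

    // No word falls inside the block, so it would be considered unreferenced.
    size_t[2] clean = [0, 1];
    assert(!mightReference(clean[], start, block.length));
}
```

The first case is exactly the false-pointer situation: an integer that happens to fall inside a block keeps that block alive.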


In addition, the GC does not know the stack layout of memory. So the 
stack can always create false pointers as it must be scanned 
conservatively. We could potentially create a precise map of the stack, 
but that involves either restricting the compiler from reassigning stack 
data to mean something else, or keeping a running map of what stack data 
is actually used. Neither seems appealing for a performance-oriented
(systems) language. Add alloca to the design category of memory
that can't be precisely scanned as well.


So we could do better on this front, but I don't know that it will
happen because of the performance concerns. Rainer Schuetze implemented
a precise scanner a while back, and did a DConf talk on it. So it is
definitely possible (to a certain degree).


-Steve


Re: Dealing with the interior pointers bug

2017-06-22 Thread ag0aep6g via Digitalmars-d-learn

On 06/22/2017 12:34 PM, Boris-Barboris wrote:
> Everything the language allows to compile is allowed by its type
> system, or is a bug in the compiler.


No. D is not supposed to be completely verifiable by the compiler.

For example, the type system guarantees that immutable data never 
changes. But the compiler allows you to cast from immutable to mutable 
and change the data. It's an invalid operation, but the compiler is not 
expected to catch that for you.
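
A small sketch of where the compiler does draw the line, under the assumption that the cast is attempted once in @system code and once in @safe code. The variable name `a` is only illustrative; the checks run at compile time via `__traits(compiles, ...)`.

```d
// Module-level immutable, so taking its address is fine in @safe code
// and only the cast itself is in question.
immutable int a = 4;

// In @system code (the default), casting away immutable compiles.
static assert( __traits(compiles, () @system { int* b = cast(int*) &a; }));

// In @safe code the same cast is expected to be rejected by the compiler.
static assert(!__traits(compiles, () @safe { int* b = cast(int*) &a; }));
```

So the guarantee is enforced mechanically only in @safe code; in @system code the cast is accepted and staying within the rules is up to the programmer.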


Re: Dealing with the interior pointers bug

2017-06-22 Thread Boris-Barboris via Digitalmars-d-learn

On Thursday, 22 June 2017 at 09:45:09 UTC, Russel Winder wrote:
> I think the term "systems programming language" contains no
> actual data, so needs to be retired. In this situation it
> provides no reason for conservative garbage collection.

It means the language designer intends to let you write an
operating system in his language, which implies certain building
blocks are available to you. Whether it actually allows it or not is
another question.

> Why should any language allow anything outside the type system.

Everything the language allows to compile is allowed by its type
system, or is a bug in the compiler.

> Strong typing means strong typing

Define strong typing then. A pointer is part of the type system,
and all casts and operations on it are too. If you pass a
wrongly-typed pointer, it won't compile.

> Maybe then the fault is having a weak type system, as any
> language is that allows ints to be pointers and vice versa.

What's wrong with pointers in a language? You're not forced to
use them, you know? But some tasks force you. If you seek
compile-time verifiability, use different coding patterns or
languages designed around this intent.

> Maybe type systems should be strong and all FFI be by value
> with no references to memory allowed?

No, they definitely should not.

Why can't the GC use statically available type info (all
pointer/reference variables and fields are visible in the program
text, so why not scan only them)? I don't know; probably it's
harder, since it requires more cooperation between compiler and
runtime, and increases GC signature (you have to store a huge
list of pointers to all pointers on all stacks (and probably the heap
too), and the compiler has to generate code to populate this list on
every function call). At this point you would be better off with
RAII, which will be more efficient and explicit and do exactly
what you tell it to do.





Re: Dealing with the interior pointers bug

2017-06-22 Thread Russel Winder via Digitalmars-d-learn
On Wed, 2017-06-21 at 10:23 -0700, H. S. Teoh via Digitalmars-d-learn
wrote:
> […]
> 
> The reason the GC must be conservative is because (1) D is a systems
> programming language, and also because (2) D interfaces directly with
> C.

I think the term "systems programming language" contains no actual
data, so needs to be retired. In this situation it provides no reason
for conservative garbage collection.

Interfacing with a foreign language, not just C, does bring problems,
but only if there is a shared address space, or the type system is weak
or allows unions including pointers.

> Being a systems programming language means you should be able to do
> things outside the type system (in @system code, of course, not in
> @safe code), including storing pointers as int values.  Any C code
> that your D program interoperates with may also potentially do
> similar things.

Why should any language allow anything outside the type system. Strong
typing means strong typing (*).

> Because of this, the GC cannot simply assume that an int value isn't
> actually a pointer value in disguise, so if that int value happens to
> coincide with an address of an allocated memory block, the only sane
> thing it can do is to assume the worst and assume that the memory is
> still live (via that (assumed) reference).  It's not safe for the GC
> to assume that it's merely an int, because if it actually turns out
> to be a pointer, then you'll end up with a dangling pointer and the
> ensuing memory corruption, security holes, and so forth.  But
> assuming that the value is a pointer is generally harmless -- the
> memory block just doesn't get freed right away, but if the int is
> mutated afterwards, eventually the GC will get around to cleaning it
> up.

Maybe then the fault is having a weak type system, as any language is
that allows ints to be pointers and vice versa. Maybe type systems
should be strong and all FFI be by value with no references to memory
allowed?

> The only big problem is in 32-bit code, where because of the very
> limited space of pointer values, the chances of a random int value
> coinciding with a valid pointer value are somewhat high, so if you
> have a large allocated memory block, the chances of a random int
> being mistaken for a reference to the block are somewhat high, so
> you could potentially run out of memory due to large blocks not
> being freed when they could be.  Fortunately, though, in 64-bit land
> the space of pointer values is generally so large that it's highly
> unlikely that a random int would look like a pointer, so this
> generally isn't a problem if you're using 64-bit, which is the case
> more and more now as vendors are slowly phasing out 32-bit support.

If there is a problem for 32-bit there is a problem for 64-bit. If it
is possible it will happen.



(*) OK, this joke probably only works in the UK.
-- 
Russel.


Re: Dealing with the interior pointers bug

2017-06-21 Thread Cym13 via Digitalmars-d-learn

On Wednesday, 21 June 2017 at 17:11:41 UTC, TheGag96 wrote:
> On Wednesday, 21 June 2017 at 15:42:22 UTC, Adam D. Ruppe wrote:
> > This comes from the fact that D's GC is conservative - if it
> > sees something that *might* be a pointer, it assumes it *is* a
> > pointer and thus had better not get freed.
>
> So is the GC then simply made to be "better-safe-than-sorry" or
> is this a consequence of how the GC does things? Or rather,
> does the GC know the type of any references to its memory at
> all? I suppose I should really ask if there's a document other
> than druntime's source that describes how the GC really works
> under the hood haha.


You may like reading 
http://olshansky.me/gc/runtime/dlang/2017/06/14/inside-d-gc.html


Re: Dealing with the interior pointers bug

2017-06-21 Thread ag0aep6g via Digitalmars-d-learn

On 06/21/2017 07:23 PM, H. S. Teoh via Digitalmars-d-learn wrote:
> Being a systems programming language means you should be able to do
> things outside the type system (in @system code, of course, not in @safe
> code), including storing pointers as int values.  Any C code that your D
> program interoperates with may also potentially do similar things.

The GC doesn't scan the C heap. You didn't say that it does, but it
might be understood that way.

> Because of this, the GC cannot simply assume that an int value isn't
> actually a pointer value in disguise, so if that int value happens to
> coincide with an address of an allocated memory block, the only sane
> thing it can do is to assume the worst and assume that the memory is
> still live (via that (assumed) reference).


There are cases where the GC does assume that ints are not pointers. For 
example, an int[] on the GC heap won't be scanned for pointers. The GC 
is neither completely precise nor completely conservative.
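
One way to observe this from user code is to ask the GC for a block's attributes. The sketch below assumes the `core.memory` calls `GC.addrOf` and `GC.getAttr` and the `GC.BlkAttr.NO_SCAN` flag; the helper name `isNoScan` is made up for illustration.

```d
import core.memory : GC;

// Returns true if the GC block containing `p` is flagged as pointer-free.
bool isNoScan(const(void)* p)
{
    // GC.addrOf maps a (possibly interior) pointer to the base of its block.
    return (GC.getAttr(GC.addrOf(p)) & GC.BlkAttr.NO_SCAN) != 0;
}

unittest
{
    int[]  ints = new int[](16);   // element type holds no pointers
    int*[] ptrs = new int*[](16);  // element type is a pointer

    assert(isNoScan(ints.ptr));    // not scanned for pointers
    assert(!isNoScan(ptrs.ptr));   // scanned word by word
}
```

The decision is made per block from the element type at allocation time, which is the "neither completely precise nor completely conservative" middle ground described above.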


Re: Dealing with the interior pointers bug

2017-06-21 Thread H. S. Teoh via Digitalmars-d-learn
On Wed, Jun 21, 2017 at 05:11:41PM +0000, TheGag96 via Digitalmars-d-learn wrote:
> On Wednesday, 21 June 2017 at 15:42:22 UTC, Adam D. Ruppe wrote:
> > This comes from the fact that D's GC is conservative - if it sees
> > something that *might* be a pointer, it assumes it *is* a pointer
> > and thus had better not get freed.
> 
> So is the GC then simply made to be "better-safe-than-sorry" or is
> this a consequence of how the GC does things? Or rather, does the GC
> know the type of any references to its memory at all?

The reason the GC must be conservative is because (1) D is a systems
programming language, and also because (2) D interfaces directly with C.

Being a systems programming language means you should be able to do
things outside the type system (in @system code, of course, not in @safe
code), including storing pointers as int values.  Any C code that your D
program interoperates with may also potentially do similar things.

Because of this, the GC cannot simply assume that an int value isn't
actually a pointer value in disguise, so if that int value happens to
coincide with an address of an allocated memory block, the only sane
thing it can do is to assume the worst and assume that the memory is
still live (via that (assumed) reference).  It's not safe for the GC to
assume that it's merely an int, because if it actually turns out to be a
pointer, then you'll end up with a dangling pointer and the ensuing
memory corruption, security holes, and so forth.  But assuming that the
value is a pointer is generally harmless -- the memory block just
doesn't get freed right away, but if the int is mutated afterwards,
eventually the GC will get around to cleaning it up.

The only big problem is in 32-bit code, where because of the very
limited space of pointer values, the chances of a random int value
coinciding with a valid pointer value are somewhat high, so if you have a
large allocated memory block, the chances of a random int being mistaken
for a reference to the block are somewhat high, so you could potentially
run out of memory due to large blocks not being freed when they could
be.  Fortunately, though, in 64-bit land the space of pointer values is
generally so large that it's highly unlikely that a random int would
look like a pointer, so this generally isn't a problem if you're using
64-bit, which is the case more and more now as vendors are slowly
phasing out 32-bit support.
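
A back-of-the-envelope sketch of the 32-bit versus 64-bit difference. It assumes a hypothetical 100 MiB block and word values that are uniformly random over the whole address space, which real program data is not, so the numbers only indicate orders of magnitude. The name `hitChance` is made up for this example.

```d
import std.math : pow;

// Chance that one uniformly random word lands inside a block of
// `blockBytes` bytes, given `addressBits` bits of address space.
double hitChance(double blockBytes, int addressBits)
{
    return blockBytes / pow(2.0, addressBits);
}

unittest
{
    enum double block = 100.0 * 1024 * 1024; // hypothetical 100 MiB block

    immutable p32 = hitChance(block, 32);
    immutable p64 = hitChance(block, 64);

    assert(p32 > 0.02 && p32 < 0.03);  // roughly 2.4% per scanned word
    assert(p64 < 1e-11);               // many orders of magnitude smaller
}
```

With thousands of scanned words, a per-word chance of a few percent makes a false reference to a large block quite likely on 32-bit, while the 64-bit figure stays negligible under the same assumption.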


T

-- 
People tell me that I'm skeptical, but I don't believe them.


Re: Dealing with the interior pointers bug

2017-06-21 Thread TheGag96 via Digitalmars-d-learn

On Wednesday, 21 June 2017 at 15:42:22 UTC, Adam D. Ruppe wrote:
> This comes from the fact that D's GC is conservative - if it
> sees something that *might* be a pointer, it assumes it *is* a
> pointer and thus had better not get freed.


So is the GC then simply made to be "better-safe-than-sorry" or 
is this a consequence of how the GC does things? Or rather, does 
the GC know the type of any references to its memory at all? I 
suppose I should really ask if there's a document other than 
druntime's source that describes how the GC really works under 
the hood haha.