Re: [polyml] PolyML.pointerEq

2014-02-05 Thread Brian Campbell

On 30/01/14 11:03, David Matthews wrote:
...

The only safe way to maintain identity of cells is to use a counter or a
ref.  An alternative to a ref is to use a locally generated exception
and this may be more efficient than a ref.  As a work-around to avoid
the problem with duplication in the multi-threaded collector you could
force it to be single-threaded with --gcthreads=1.


Thanks, that makes things very clear.  I've been using the workaround 
successfully for several days now and I'll report the problem to the HOL 
bug tracker to allow something more permanent to be sorted out.


  Brian
___
polyml mailing list
polyml@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: [polyml] PolyML.pointerEq

2014-01-30 Thread David Matthews

On 30/01/2014 11:40, Makarius wrote:

The puristic approach to structutal equality in SML turns out as a
benefit these days, when the system runs with loosened brakes on
multicore hardware.  Neither the JVM nor OCaml could afford that (but I
think Haskell / GHC does).


The measurements I did suggested that using the relaxed consistency in 
the GC did not make an appreciable difference.  However, I would expect 
that in the future it would help.  Using an atomic test-and-set requires 
a system-wide hardware lock and it seems better to do that only for 
mutable data rather than when each immutable cell is copied.



I did not know about the unique status of locally generated exceptions
yet.  Does the SML standard actually say anything about exception
equality?  Would you recommend that as a practical approach?  It seems
to be a candidate for incompatibilities between non-equal SMLs.


Type exn is not an equality type but it is possible to detect "equality" 
using pattern matching.


> fun f () = let exception e in (e, fn e => true | _ => false) end;
val f = fn: unit -> exn * (exn -> bool)
> val (a, tstA) = f() and (b, tstB) = f();
val a = e: exn
val b = e: exn
val tstA = fn: exn -> bool
val tstB = fn: exn -> bool
> tstA a;
val it = true: bool
> tstA b;
val it = false: bool
> tstB a;
val it = false: bool
> tstB b;
val it = true: bool
>

Actually, I had another look at the code and it doesn't seem as though 
there is any advantage in using an exception rather than a ref. 
Exceptions are actually represented by refs but I had a feeling that 
they might be marked as "byte references" which would have reduced the 
overhead during garbage collection.  Using a counter (i.e. using an 
immutable int) is better still.


David
___
polyml mailing list
polyml@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml


Re: [polyml] PolyML.pointerEq

2014-01-30 Thread Makarius

On Thu, 30 Jan 2014, David Matthews wrote:

The intention of PolyML.pointerEq is that if pointerEq(x, y) is true 
then x = y will be true, assuming the type of x and y admits equality. 
It is really intended for use where a user function wants to test for 
equality in some general sense and knowing that actually the values are 
the same pointer can short circuit the process.


The only safe way to maintain identity of cells is to use a counter or a 
ref. An alternative to a ref is to use a locally generated exception and 
this may be more efficient than a ref.  As a work-around to avoid the 
problem with duplication in the multi-threaded collector you could force 
it to be single-threaded with --gcthreads=1.


Thank you for the nice explanations, and the implementation behind it.

The puristic approach to structutal equality in SML turns out as a benefit 
these days, when the system runs with loosened brakes on multicore 
hardware.  Neither the JVM nor OCaml could afford that (but I think 
Haskell / GHC does).


I did not know about the unique status of locally generated exceptions 
yet.  Does the SML standard actually say anything about exception 
equality?  Would you recommend that as a practical approach?  It seems to 
be a candidate for incompatibilities between non-equal SMLs.



In Isabelle/ML we had unit ref as "stamp" over many years, but we now do 
it in a fairly standard way via some global (synchronized) counter: 
http://isabelle.in.tum.de/repos/isabelle/file/Isabelle2013-2/src/Pure/ML-Systems/multithreading_polyml.ML#l177


This involves some overhead to create a fresh item, but later comparisons 
are pure, and there is also a conical time-stamp order on it.



Makarius
___
polyml mailing list
polyml@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml


Re: [polyml] PolyML.pointerEq

2014-01-30 Thread David Matthews

Brian,
The intention of PolyML.pointerEq is that if pointerEq(x, y) is true 
then x = y will be true, assuming the type of x and y admits equality. 
It is really intended for use where a user function wants to test for 
equality in some general sense and knowing that actually the values are 
the same pointer can short circuit the process.


Poly/ML only guarantees to maintain the ML equality property i.e. 
pointer equality for mutable values (refs and arrays) and value equality 
for immutable data.  For example, the compiler may optimise tuple 
creation so it is quite likely that what may appear to be a single value 
with a clear point of creation is actually optimised away or results in 
more than one piece of memory being allocated.


The present garbage collector has two different operations that may 
affect pointer equality.  The minor collector copies cells out of the 
allocation area.  For mutable data it uses an atomic test-and-set that 
guarantees that if two different GC threads copy the cell only one copy 
will be made.  This is not used for immutable data to reduce the need 
for bus locking and so there is a finite chance that a cell will be 
duplicated.  Also the major collection can activate a sharing pass that 
merges cells with the same contents resulting in pointer equality being 
true where it previously wasn't.


The only safe way to maintain identity of cells is to use a counter or a 
ref.  An alternative to a ref is to use a locally generated exception 
and this may be more efficient than a ref.  As a work-around to avoid 
the problem with duplication in the multi-threaded collector you could 
force it to be single-threaded with --gcthreads=1.


David

On 29/01/2014 15:51, Brian Campbell wrote:

What semantics, if any, does PolyML.pointerEq have?  In particular, is
it supposed to be guaranteed that if pointerEq (x,y) is true, then it
will remain true after GC?

My reason for asking is that I've been having trouble with assertion
failures in HOL when reducing large terms.  The assertions check that
the left hand side of the conclusions of two theorem objects are equal
according to pointerEq, because this allows a fast comparison that
"should" always be true if the theorem is manipulated carefully.

I'm fairly certain the assertion failures are related to GC:  they are
non-deterministic on code that ought to be deterministic, and vary in
frequency across different machines.  (I haven't seen it at all on a 4GB
32-bit machine, rarely on 4GB 64-bit, often on the 8GB I normally use
and infrequently on a compute server with much more.)  There's no sign
of any data corruption, however.

   Brian
___
polyml mailing list
polyml@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml


___
polyml mailing list
polyml@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml


[polyml] PolyML.pointerEq

2014-01-29 Thread Brian Campbell
What semantics, if any, does PolyML.pointerEq have?  In particular, is 
it supposed to be guaranteed that if pointerEq (x,y) is true, then it 
will remain true after GC?


My reason for asking is that I've been having trouble with assertion 
failures in HOL when reducing large terms.  The assertions check that 
the left hand side of the conclusions of two theorem objects are equal 
according to pointerEq, because this allows a fast comparison that 
"should" always be true if the theorem is manipulated carefully.


I'm fairly certain the assertion failures are related to GC:  they are 
non-deterministic on code that ought to be deterministic, and vary in 
frequency across different machines.  (I haven't seen it at all on a 4GB 
32-bit machine, rarely on 4GB 64-bit, often on the 8GB I normally use 
and infrequently on a compute server with much more.)  There's no sign 
of any data corruption, however.


  Brian
___
polyml mailing list
polyml@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.