Re: [fricas-devel] [PATCH] change the representation of Tree to be

2018-12-15 Thread oldk1331
On Sun, Dec 16, 2018 at 3:07 AM Waldek Hebisch  wrote:
> Well, in FriCAS tree is an aggregate.  And empty aggregate
> is always legal.  In fact, empty aggregate is a generic
> way to start building an aggregate.  So disallowing it
> does not look right.

For Tree, can I commit the patch with:

empty_tree := [NIL$Lisp, []]
empty() == empty_tree
empty? t == eq?(t, empty_tree)

I still don't think "empty()" is essential to aggregate.
It is useful sometimes to define aggregate with fixed size,
for such aggregate, "empty()" is illegal.

-- 
You received this message because you are subscribed to the Google Groups 
"FriCAS - computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To post to this group, send email to fricas-devel@googlegroups.com.
Visit this group at https://groups.google.com/group/fricas-devel.
For more options, visit https://groups.google.com/d/optout.


Re: [fricas-devel] [PATCH] Fix 'child?' and 'node?', use 'eq?'

2018-12-15 Thread Waldek Hebisch
oldk1331 wrote:
> 
> On 11/23/18 3:42 AM, Waldek Hebisch wrote:
> > oldk1331 wrote:
> >> "node?(u, v)" should return true only when they point to the
> >> same data structure, aka when they share part of data structure,
> >> so it should use 'eq?' instead of '='.
> >>
> >> Because this is much more faster and fits well to functional
> >> programming style.
> >
> > Well, this is controversial.  Theoreticaly aggregate can be
> > represented in a compressed way and create nodes only on
> > demand, so that '=' is true, but not 'eq?'.  More realistically
> > user my create a node which should be '=' to some node in
> > the aggregate and use say search to find position.  With 'eq?'
> > such code would fail.
> 
> Even in such cases, there exists _equality_ that can be tested
> in O(1) instead of O(N).

Well, 'eq?' gives you positive answer in O(1).  '=' normally
gives you negative answer just after few comparizons.
So if you first test for 'eq?' and then do full thing
expected time is similar to just using 'eq?'.

>  In this compressed representation,
> we can use such operation to replace 'eq?'.
> 
> Searching should still use "=", the change with "node?' doesn't
> conflict with that.
> 
> > I am not sure if the two reasons above are important enough
> > to always use '='.  But if we decide to use 'eq?' it would
> > need some big red scare in documentation.  Also, bugs due to
> > using 'eq?' are likely to be hard to find.  In the past
> > code used '=' and 'eq?' was limited to use as an optimization
> > (with appropriate care to avoid bugs).
> >
> 
> I think the idea behind RecursiveAggregate is to encourage sharing
> data structure through linking (by pointer). And "node?(x, y)" asks
> the question that if x is part of y.  It doesn't make sense
> to say "a copy of part of y" is still part of y.

It make sense.  Namely, old trick is to compress tree by sharing
of subtrees.  ATM simplest (and inefficient) way to detect shared
subtree is delete a subtree and to call 'node?'.  
Mathematicaly there is no such thing as a copy: two things are
either equal (and treated as the same thing) or not.  Put
it differently: copy at programming language level is still
the same thing.  And re-building something from part is still
the same thing.  Another question  is if we want to support
this.  It is reasonable to say that the test in 'node?' uses
'eq?' but given that FriCAS normally uses '=' that requires
resonaby prominent mention in documentation.

To be clearer, I have many times found situations when person 1
says "nobody needs feature F", while person 2 writes large
program that depends on feature F.  So my point is that
saying 'It doesn't make sense' or 'nobody expect this' is
not helpful -- somebody _surely_ expect current behaviour.
In fact, current code may by just inattention to performance,
but equally likely author of current code deliberatly
used '=' to get "correct" semantics.

> 
> Function "distance" is also similar in spirit, I think it makes
> most sense when we are talking about "how many nodes between two
> pointers that point to the same structure".
> (And current declaration of 'distance' doesn't require BasicType.)

Well, ATM for me main problem is what 'distance' should mean.
Then make sure that code actually implements this definition
(if current code is correct than I would say that definition
is a bit weird).

-- 
  Waldek Hebisch

-- 
You received this message because you are subscribed to the Google Groups 
"FriCAS - computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To post to this group, send email to fricas-devel@googlegroups.com.
Visit this group at https://groups.google.com/group/fricas-devel.
For more options, visit https://groups.google.com/d/optout.


Re: [fricas-devel] [PATCH] change the representation of Tree to be

2018-12-15 Thread Waldek Hebisch
oldk1331 wrote:
> 
> On 12/11/18 3:55 AM, Waldek Hebisch wrote:
> > oldk1331 wrote:
> >>
> >> This patch changes the Rep of Tree from
> >> Union(node:Record(value: S, args: List %),empty:"empty")
> >> to
> >> Record(val : S, args: List %)
> >> and uses "NIL$Lisp" to represent empty tree.
> >>
> >> Before this patch,
> >>
> >> (1) -> tree [1,2,3] pretend SEX
> >>
> >>(1)  (0 1 (0 2) (0 3))
> >>
> >> And after:
> >>
> >> (1) -> tree [1,2,3] pretend SEX
> >>
> >>(1)  (1 (2) (3))
> >>
> >> So this patch can save some (maybe a third) memory.
> > 
> > Sorry that I did not comment earlier: this kind of change is
> > very dangerous.  Namely, it can work quite nice in testing
> > and then lead to error say 3 years later.  The point is
> > that there is correspondence between FriCAS types and
> > Lisp representation.  Part of this correspondence are
> > known to Spad compiler and (via declarations) transmited to
> > Lisp compiler.  So Lisp compiler is told that effectively
> > Record never is NIL.  Breaking this can lead to nasty
> > errors when valid optimization is breaking "working" code.
> 
> What if we do not use Rep, but use rep/per to trick the
> compiler?  Then the compiler will not know about underlying
> representation and give them declaration.

rep/per alone solves almost nothing.  The main problem is in
operations.  You would need to add new set of primitives
consistent with desired representation at Lisp level.
For example, U32Vector is represented by Lisp array, but
of specialized type.  Declarations present in operations
for U32Vector are invalid for other arrays and operations
for other arrays would be invalid for U32Vector.

So you can use different representation, provided that
it is done in consistent way and down to level of
primitives.

> 
> I'm also writing a DoublyLinkedList Domain using:
> Rep := Record(val : S, next : %, prev : %)
> 
> If I change that Rep to
> Union(node:Record(val : S, next : %, prev : %),empty:"empty")
> 
> Then the code will not only be much slower but also be much
> more verbose:
> 
> You can no longer access the "next" and "prev" pointer directly,
> you have to check for empty case every time.

Well, one can do one of the following:

- dully implement genric pattern (with checks for empty)
- only implement non empty case _and_ do not make it an aggregate
- use specialized representation with needed primitives

> > Let me add that Spad compiler contained (I would have to check
> > if it still contain) code to optimize unions in way you are
> > trying to do, namely avoid wrapper and tags in case when
> > Lisp types are disjoint.  Activationg this code would
> > give large memory savings for several important types.
> > But it would also mean that we need make sure that
> > optimization is used only when it is valid and that
> > declarations in generated Lisp code match.  There
> > is also question of runtime performance: smaller
> > memory use can give large speedups, but without declaration
> > at Lisp level code must contain more checks.  So
> > there is delicate balance, small examples probably
> > would be slower while large would gain.  Anyway, when
> > I looked at this my conclusion was that we do not
> > have developement resources to go in this direction.
> > Simply, we already have too many bugs to add optimizations
> > which are likely to bring more bugs.
> > 
> > Concerning Tree, it has little use, so bugs here have less
> > impact.  But by the same logic optimizing Tree adds little
> 
> Well, by improving Tree, my next target is to improve BinaryTree,
> which use Tree as Rep.  Next, maybe Red-black tree, and use that
> as Rep of Set.  I think these are important data structures
> and should be optimized.

Well, Tree as a representaion of BinaryTree is already suboptimal.
BinaryTree as a representation of balanced trees again adds
overhead.  If we want fast and efficient search tree we should
use specialized representation.

Note that given inheritance chain you propose empty set
would be represented as empty tree, so disallowing empty tree
would break inheritance at first place.

Concerning Set it is not so clear for me what to do with
representation.  From point of view of memory use arrays
are more efficient (OK, list of arrays would be most
compact).  For small sets lists are almost as good and
for small sets using lists gives quite good speed with
rather short code.  In ordered case arrays could give
comparable or better search speed compared to balanced
trees, faster merge but slow insertion.  

More generaly, for many special cases specializing code
and data structures could give us gains.  Problem is
that currently automatic specialization of data structures
is inpractical and even explicit specialization by
programmer has rather poor support (in a sense C++ has
best known support for this, but IMO trying to emulate
C++ here is not wise).  So specialization brings
cost of extra/duplicated code and growing complexity.
Complexity

Re: [fricas-devel] [PATCH] change the representation of Tree to be

2018-12-15 Thread Waldek Hebisch
oldk1331 wrote:
> 
> 
> > Sorry that I did not comment earlier: this kind of change is
> > very dangerous.  Namely, it can work quite nice in testing
> > and then lead to error say 3 years later.  The point is
> > that there is correspondence between FriCAS types and
> > Lisp representation.  Part of this correspondence are
> > known to Spad compiler and (via declarations) transmited to
> > Lisp compiler.  So Lisp compiler is told that effectively
> > Record never is NIL.  Breaking this can lead to nasty
> > errors when valid optimization is breaking "working" code.
> 
> Another way to look at this "use NIL to represent empty tree"
> problem:
> 
> We disallow the existence of empty tree.
> 
> 1. Empty tree is not required by the definition of tree.
> 2. You can not construct an empty tree from existing
> and future operations of Tree:

Well, in FriCAS tree is an aggregate.  And empty aggregate
is always legal.  In fact, empty aggregate is a generic
way to start building an aggregate.  So disallowing it
does not look right.

-- 
  Waldek Hebisch

-- 
You received this message because you are subscribed to the Google Groups 
"FriCAS - computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To post to this group, send email to fricas-devel@googlegroups.com.
Visit this group at https://groups.google.com/group/fricas-devel.
For more options, visit https://groups.google.com/d/optout.


Re: [fricas-devel] [PATCH] change the representation of Tree to be

2018-12-15 Thread Waldek Hebisch
Ralf Hemmecke wrote:
> 
> > Yes, it is debatable, as we can see in
> > https://en.wikipedia.org/wiki/Tree_(data_structure)
> 
> Since you refer to wikipedia... why not using this definition:
> https://en.wikipedia.org/wiki/Tree_(data_structure)#Type_theory
> In orther words, we introduce forests.
> 
> Shouldn't it be possible to define Forest and Tree as mutually recursive
> data types?

Well, theoretically that would be quite nice.  On practical level
I am not sure if it solves any problem.

> >> The empty tree does indeed look very special. An empty doubly linked
> >> list is probably more common.
> > 
> > Yes, empty doubly linked is unavoidable.  One way to work around
> > is to have sentinel node, so that empty list is a list that has
> > only sentinel node. But it makes traverse operation like "rest"
> > difficult.
> > 
> > Another way to work around is to actually define an empty
> > list in domain:
> > 
> > empty_list :=
> > node := [NIL pretend S, NIL pretend %, NIL pretend %]
> > node.prev := node.next := node
> > node
> > empty() == empty_list
> > empty? l == EQ(l, empty_list)
> > 
> > This is essentially the same as using NIL, but it should make
> > compiler happy.
> 
> When you first wrote about improving the datastructure by avoiding
> Union, I thought it is a good idea. However, although I don't like that
> the compiler assumes that a record is not NIL, that is a reasonable
> assumption. And more importantly, the use of Union in the definition of
> some data type (maybe for Tree it's debatable, since one can have
> Forest) can be closer to the mindset of the programmer/mathematician. If
> possible, optimization should be left to the compiler. OK, we all know
> that compilers are not always that good. And although I personally like
> to think about good datastrucutres, I think that it is of high value for
> the future that programmers can simply write down the most natural
> mathematical definition that comes to their mind and have the compiler
> making efficient code out of it.

Just some remark about optimization.  Compilers normally do essentially
_no_ optimization of data structures.  Problem is that data can
flow between various part of program and all parts must agree on
representation.  Given separate compilation decision about
representation is made using incomplete information and compiler
must assume that most general things can happen.  That kills
most possibilities of optimization.  More precisely, for given
type compiler always must to use the same representation.
There is some possibility for cheating, for example same languages
give you flexible array when you say that you want a list.
But cheating can not go too far, as the type has to provide
_all_ operations which are logically present.  In case
of flexible array, 'rest' either is not provided or is
horribly expensive.  One could also try some clever scheme
where 'rest' just creates an alternative view of an array.
But the mere possiblity of alternative views (even if given
program does not use them) slows down all operations.  And
smart 'rest' makes problematic to implement smart 'cons'.
So common wisdom is to avoid clever schems and just use
very strightforward mappings from types to representation.

-- 
  Waldek Hebisch

-- 
You received this message because you are subscribed to the Google Groups 
"FriCAS - computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To post to this group, send email to fricas-devel@googlegroups.com.
Visit this group at https://groups.google.com/group/fricas-devel.
For more options, visit https://groups.google.com/d/optout.


Re: [fricas-devel] Conditional exports in some types

2018-12-15 Thread Waldek Hebisch
Ralf Hemmecke wrote:
> 
> Hi Waldek,
> 
> Peter has updated his pull request
> 
> https://github.com/fricas/fricas/pull/5
> 
> some time ago. It is working for me and I would like to commit these 5
> patches to the fricas svn repository.
> 
> I was a bit concerned about him adding lowercase 'and, 'or, 'not into
> ax.boot.
> 
> https://github.com/fricas/fricas/commit/8c372df4dcfd1effbf66ba2780aea0d067b0bf10#diff-6b936c4207acbd850ff5bdb79c61e89fR378
> 
> But seemingly as described in my mail from Oct 2
> 
> see here
> https://groups.google.com/forum/#!msg/fricas-devel/vFWHujprxno/NEbM_1oUAAAJ
> 
> or here
> https://www.mail-archive.com/fricas-devel@googlegroups.com/msg12569.html
> 
> , these symbols are really generated by FriCAS.
> 
> It is not clear why FriCAS does this.
>
> Nevertheless, since I want to get rid of these pending patches, my
> question, can I commit them to the FriCAS repo?
> We can single out the issue with 'and vs AND later.
> 
> The patches are only connected to the aldor interface and don't
> otherwise interfere with anything else in FriCAS.
 
Yes, please go on and commit.

Concerning 'and' versus 'AND' problem I am affraid that
this is just part of bigger problem, that is general mess
in Spad compiler.  This is big problem in the sense that
"simple" thing with correct semantics is uncomputable
(the problem of staticaly deciding what will happen to conditions
at runtime is undecidable).  Simple computable appraches are
quite restrictive.  "Better" computable things quickly tend to
be quite complicated.  So Spad compiler gives ill-defined
handling which in cases used by Spad library gives resonable
results.  But the compiler code while relatively short is
messy.  'and' versus 'AND' is part of this mess.
More precisely, at Spad source level we have 'and'.
At runtime we need 'AND' because conditions are evaluated
by Lisp.  In clean design there would be sharp transition
between one representation (using 'and') and the other
(using 'AND').  But Spad compiler cleverly re-uses routines
for multiple purposes and extensively uses recursion.
So there are no clear separation of stages, logically the
same processing is done in several stages and data from
later stage may be feed back as input to earlier stage.

The intent of several my commits to compiler code was to
untangle this mess.  I would say that now most "accidental"
problems/inconsistencies are removed.  But deeper problem
remain, in particular problem of typechecking types: our
typechecking routines sometimes (in general case) need to
evaluate types.  But to safely evaluate we need to typecheck
first.  So there is stange recursion between evaluation
and typechecking.  Simple conditions that a priori
exclude possibility of infinite recursion seem to be
quite restrictive.  More useful approach could say that
compiler should do things as lazy as possible and
when even lazy way lead to infinite recursion than
program is incorrect.  OTOH even when lazy evaluation
leads to infinite recursion it may be possible to
compute fixpoint.  Let me say that giving sound
and useful theoretical formulation and implementing
it is still an open problem.  And that affects 'and'
problem: we need to find replacement for current
handling of conditions.

-- 
  Waldek Hebisch

-- 
You received this message because you are subscribed to the Google Groups 
"FriCAS - computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To post to this group, send email to fricas-devel@googlegroups.com.
Visit this group at https://groups.google.com/group/fricas-devel.
For more options, visit https://groups.google.com/d/optout.