[Axiom-developer] Ideas for "next generation" CAS

daly Mon, 08 Aug 2005 19:01:47 -0700

CY,

re: published summary of thoughts.


not really. it's an evolving idea. my current thinking on the subject
revolves around the "30 year horizon" and the "petamachine question".

in 30 years a researcher will have a machine with a Thz of cpu, a
TByte of storage, a PetaByte of disk space, and a Thz of bandwidth
(the so-called petamachine).  they'll have access to all previously
published literature and real-time access to currently written
materials flowing past on the connection, sort of a Napster version of
science.  so what kind of software is needed to do research?

clearly today's methods of building software cannot scale. if we
project axiom into this future situation and scale it by a factor of
100 we get 300 million lines of source, 110,000 domains, 1.1 million
functions, etc. i believe that axiom can scale much better than any
commercially available system because it is built on a better basis
(category theory). unfortunately while the theory will scale the 
current system will not.

why? many reasons. and as we think of them (this is nowhere near a
complete list of the things worth mentioning but time is short and
this margin is narrow :-) ) we need to develop research efforts to
attack these problems. even more unfortunately there is no support
for research at such fundamental levels anymore. (but we can still
do research without support)

consider what axiom "suggests" in its current form. if you look at the
inside cover of the axiom book or in src/doc/endpapers.pamphlet you'll
see an organization of the mathematics and data structures. this
raises several questions worth exploring.

  (a) is there a good graph of the categorical structure of math?
      (good w.r.t to the computational critera). we need a research
      effort to develop a classification scheme and a graph.
  (b) axiom creates categories which exist for computational reasons,
      such as RNG, but which are not reflected in non-computational
      math. we need to investigate the properties of RNG to either
      eliminate it or develop its axiomatic basis.
  (c) each of the categories, such as MONOID, have associated axioms.
      we need to "decorate" the categories with these axiom to provide
      a connection to the proof systems.
  (d) we need to examine the complete matrix of all of the 
      (category x category) and examine each function to ensure that
      it exists in the most general form as well as specific versions.
      this opens up whole research vistas of computational mathematics
      looking for specific algorithms that are more efficient than the
      general purpose case. given enough general vs specific examples
      we could develop theory that defines the classes of optimization
      and let the machine generate specific algorithms in real time.
  (e) given categories decorated with their axioms and program-proof 
      machinery we can look at the functions in each domain and begin
      to construct proofs for these functions. i believe that we could
      do this "in the small" with early categories (such as MONOID) 
      using ACL2 (which is in lisp and can be loaded into an axiom 
      image). only a lack of free time has stopped me from doing this.
  (f) we can look at specific functions from a domain in a category and 
      we can write the pre- and post-conditions. once we collect these
      into a catalog then systems like THEOREMA (Buchberger) can 
      search the catalog for algorithms that "logically cover" the
      constraints. this would allow THEOREMA to dynamically discover
      that the needed algorithm is actually "sort" or "log" and that
      axiom provides it.

PROGRAM PROOF

when we extend the ideas above and worry about the software scaling
issues it becomes clear that we need to research ways to develop
scientific algorithms from first principles. that is, we need to have
a way to specify a needed algorithm and let the machine derive a
complete, correct, valid, and verified program, possibly using logic
extended with programming constraints and technique libraries and
extensions of ACL2 toward generative rather than proof techniques.

in general this is too hard but within the axiom framework it might
be possible. since each function lives within a domain within a 
category and the domain and range of the function can be well specified
it might be possible to develop a language for algorithm specification.
i'd like to work on developing a set of group theory algorithms that
were automatically generated from the axiom category, domain, and
functional constraints. 

proceeding in this way we can think of constructing a new axiom 
library (and associated interpreter/compiler) that is provably
correct. i feel that the current method of developing software must
eventually give way to this kind of technique since it is vitally
important that we can trust the correctness of the computational
mathematics. 
  

LITERATE PROGRAMMING (Doyen)

lets move to another level of research for computational math.
currently papers are published that give algorithms in computational
mathematics. however they do not publish the actual code with the
paper. to me this is like publishing a theorem without showing the
proof. we need to change this for several reasons. ideally we should
be able to go to a conference, attend a talk, drag-and-drop the paper
from a URL and have the running algorithm and associated documentation
automatically added to our local copy of axiom.

there are some efforts on this. i've started the Doyen project 
(http://sourceforge.net/projects/doyencd)
which will be a bootable CD where you will be able to do this 
drag-and-drop kind of additions to science software. Carlo Traverso
at the Univ. of Pisa is trying to start a "literate journal" that
will include source code with published papers. the ACM has been
publishing the ISSAC proceedings on CD for the past two years which
includes software (but not yet literate programs).

we are computational mathematicians and we need to publish our code.
we need to develop review standards. we need to develop code proof
requirements, termination proofs, complexity statements, code
portability standards, etc. i have permission to use a few papers
that have been published as examples of these "standards". these
will be in axiom as soon as i get the actual programs able to be
"dragged and dropped" onto the Doyen version of axiom.

given the complexity of building, maintaining, and expanding
a system as large as axiom we need to intimately combine the
theory and the code into a literate form. otherwise the expertise
that created the code will die off and we won't be able to maintain
such systems in the future. the previous generation of researchers 
have begun to retire or die and all of that expertise is lost.

SCIENCE PLATFORM (Crystal)

we currently do science in arbitrary piles (math, chemistry, physics,
engineering, etc). computational science crosses all of those boundaries.
we need to develop a science platform that can present the algorithms
to the researcher "thru a facet" that makes them seem like special cases.
for example, crystallography algorithms are mostly group theory. the
group theory algorithms should be presented to the crystallographer in
a language they understand even though the underlying algorithms are 
the same as used elsewhere (say in particle physics). so there have to
be ways to customize the display of the information to match the person
who is doing the research. the "model" of the researcher and the "model"
of the problem need to be integrated so the system can maintain the
"intensional stance" (the way the system molds itself to the researcher
and the current problem under study). this "stance" is actually a 
layer between the problem (the semantic net) and the interaction
(the crystal) mentioned below.

consider that all of the published science could now be put on a single
hard drive (except for copyright issues). each researcher could start
with this copy. since the science platform is net-connected and new
papers are being published constantly we need to develop technology 
to keep the information in some common form. there are hints of ways
to do this (see http://www.research.ibm.com/UIMA) but we need to 
stretch the bounds further. the meaning of the terms in the paper
need to be represented and extracted. the meaning of the paper needs
to be represented and extracted. all of these various levels of meaning
and their associations need to be represented. my current thinking is
to extend latex with a new set of semantic markup tools ("semtex?")
so that ideas, terms, definitions, etc can have explicit markup.

there is a technology for knowledge representation (see
http://lists.gnu.org/archives/html/axiom-developer/2003-12/msg00071.html)
that uses a "semantic network" as the primary representation. i'd like
to have a KROPS-like representation as the center of the problem.
research needs to be done to extend the dual OPS5-KREP techniques.

dragging these issues together we get the "crystal" idea. think of
the "problem" that the researcher is working on as a semantic graph
hanging in space. surround the graph with a crystal with many facets.
each facet represents a view (actually, a view-controller in the 
"model-view-controller" sense). one "facet" view might show an annotated
bibliography related to the problem. another facet might show the literate
program. another facet might be a graph of one of the functions in the
problem. another faceet might be a 3D solid model of the topological
space (math), or the bridge under construction (eng), or the molecule
(biochem). changes done in one view are globally reflected since the
semantic net is changed. changes cause automatic literature searches
(we have cpu to spare) and the system can suggest related literature.
based on feedback to the suggestions new literature is filtered (thus
the "intensional stance" of the researcher for this problem is modeled).

so developing a prototype of the crystal idea is a key research project,
at least on my version of the 30 year horizon critical path.

nobody has the political will to fund such long-term research so the
only possible way to develop this technology is thru open
source. axiom was funded for 23 years and is an amazing program. we
need to spend much more time and effort to meet the needs of the
researcher in 2035.  computational math, unlike any other software,
generates timeless software. the integral of "sin(x)" will still be
the same in 30 years.

there is much more and i really ought to put together a talk about it
at some point. perhaps once the doyen project works so i can demonstrate
the ideas.

we really should have a debate about what the future researcher needs
and how use future computers to cope with the information explosion.

if you want more comments or specific details of my local projects 
(like DOYEN, or KROPS, or CATS) just post on this list.

hope this is useful to you.

t


      
     


_______________________________________________
Axiom-developer mailing list
Axiom-developer@nongnu.org
http://lists.nongnu.org/mailman/listinfo/axiom-developer

[Axiom-developer] Ideas for "next generation" CAS

Reply via email to