Re: Graph database

andreas Wed, 25 Mar 2020 16:18:16 -0700

Dear Lawrence

It sounds to me like your head got stuffed a bit too well with
over-complicated concepts. No offense! That is the nature of most
software education, and it's even worse in the business world. And we
programmers have a strong tendency to believe we are more clever when we
are working on more complex systems: "Weeks of coding can save you hours
of planning."


PicoLisp is radical in its focus on simplicity, although it is not an
easy language. It is even less easy if education and other languages
have trained you to think in more complicated patterns than really
necessary. I went through these struggles too, and they made me an
overall much better software developer, also in other languages & stacks.

Some fundamentals:

  * PicoLisp does not have much in common with Common Lisp, though they
    share common ancestors
  * As with Lisp languages in general: the source code is not a list of
    instructions, but an abstract syntax tree (AST)
      o compilers (in the mainstream sense) for other programming
        languages construct internally an AST from source code, and then
        they optimize that AST and translate it to machine code
      o for PicoLisp, the reader (the R of REPL) translates source code
        (which can also be REPL text input) into an AST, basically trees
        of pointers
      o this AST /is/ nothing other than a data structure: the source
        code (textual data) becomes a tree in RAM (lists of lists)
      o PicoLisp can modify this AST (the code data structure) during
        execution - primarily by interpreting it as code or as data;
        both are the exact same stuff in memory (more so in PicoLisp
        than in some other Lisps)
  * PicoLisp is a multi-paradigm language
      o it is not purely functional (like e.g. Haskell)
      o operations can be immutable or mutable
  * Lisp (*lis*t *p*rocessor) languages are based on lists
    (everything with parentheses around it is a list)
      o A list is always a grouping of (possibly) multiple elements
      o Elements which are not lists themselves are called *atoms*
        (single things)
  * PicoLisp has only 3 fundamental data types:
      o Number
          + signed integer of arbitrary size
          + this is an atom (atomic type)
      o List
          + singly linked list (so always an ordered sequence of
            elements, not an unordered set)
          + may contain lists and atoms
      o Symbol
          + has a name
          + has a value
          + may have an arbitrary number of properties
              # a property consists of a name and a value
          + this is an atom (atomic type)
      o a /value/ always is of one of these 3 data types
          + unlike in many other languages, variables don't have a data
            type (the value has a data type)
          + variables in PicoLisp are just symbols
          + the data type of a value is both static and strong (cannot
            be changed)
      o all other data types (e.g. classes / objects) are based on these
        3 fundamental PicoLisp types
          + non-fundamental types have practically no enforcement, no
            checks - unless explicitly called
              # not entirely true, the fundamental data types have
                built-in sub-variants for which certain rules apply
                (e.g. transient symbols, primarily used as string type)
          + so all non-fundamental types are *duck typed*
            <https://en.wikipedia.org/wiki/Duck_typing>
          + this enables easier code reuse
      o for example: the OOP system in PicoLisp is primarily based on
        the symbol data type
          + classes and objects are symbols which follow certain principles
              # member variables (attributes) are stored as properties
                of the symbol
              # the value of the symbol is a list containing parent
                classes and methods
  * The PicoLisp database mechanism is multi-paradigm
      o *on the lowest level: a /key-value store/*
          + just values of the symbol data type which are persisted to disk
          + the so-called *external symbols* - a sub-variant of the
            fundamental symbol data type
          + the name of such an external symbol is the logical block
            address of the data within the database file(s)
          + external symbols are automatically loaded into RAM on first
            access (lazy loading)
      o *external symbols combined with the PicoLisp OOP system: /object
        database/*
          + persistent OOP objects
          + no translation/copying between objects in RAM and database
            (no ORM problem)
          + as the objects are just external symbols which follow
            certain principles, they're lazy loaded when accessed
      o *relationships between database classes (entities) defined in a
        schema: /graph database/*
          + does not exactly follow the strict academic definitions of a
            graph database
          + but works as one for all practical purposes, and is even less
            restricted than / has advantages over the pure graph database
            concepts
          + node: the object (also called a record, one database entry)
              # so just an external symbol
          + edge: relation between two objects
              # directed relation: +Link
              # undirected relation: +Joint
              # property of an external symbol
          + while additional values could be attached to an edge (e.g.
            +Bag relationship), the better solution is usually to put
            another entity/object in between
              # a bit like the n-to-n helper tables in relational
                databases, but far more powerful and far less idiotic to
                work with
          + the relations (edges) of an object (node) can easily be
            followed using the */(get)/* function (as far as the object
            has knowledge of the relation)
              # lazy loading results in automatic loading, no annoying
                requirement to call population functions
      o *built-in B-tree mechanics (also based on the key-value
        mechanism): indices / indexes*
          + the good parts of relational databases
          + properties of external symbols may be indexed in separate
            B-trees
          + index values can be atoms (number, symbol) and also lists
            (e.g. combining multiple properties into a single index
            using the +Aux relation type)
              # works well because in PicoLisp the ordering of the 3
                fundamental data types is well-defined
      o *lock mechanic: ACID transactions*
          + full ACID support
          + due to lazy loading of external symbols, cache eviction is
            as simple as just wiping updated external symbols from RAM
            in parallel database sessions (sibling processes)
          + lock mechanic is very simple
              # not as optimized as in common relational databases
              # but due to simplicity the overhead is so tiny that it
                usually has no impact on practical use
              # fewer moving parts, fewer parts that can break
      o *possibility to reference external symbols from remote databases
        (ext mechanism): /distributed database/*
          + in the network sense, not strictly in the sense of NoSQL
            distributed databases
              # though such concepts could be built on top of the current
                PicoLisp database
          + as with everything in PicoLisp: vertical scaling is favored
            over horizontal scaling
              # so more like multiple traditional databases synchronized
                with each other, not multiple databases pretending to be
                a single thing
                  * programmer in control, use abstractions for power
                    instead of promoting ignorance (e.g. ignorance about
                    random nature of network delays)
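
To make the node/edge part above a bit more concrete, here is a minimal
sketch in Python. It is only a model of the idea, not PicoLisp code and
not how the PicoLisp database is implemented. All names in it (Node,
link, joint, alice, bob, acme, job) are made up for illustration; the
link and joint helpers only imitate the spirit of +Link and +Joint, and
the get method only imitates following relations with (get).

```python
class Node:
    """Stand-in for an external symbol: a name plus named properties."""

    def __init__(self, name):
        self.name = name
        self.props = {}

    def get(self, *path):
        """Follow a chain of property names, one hop per name."""
        value = self
        for key in path:
            value = value.props[key]
        return value


def link(src, prop, dst):
    """One-way edge (the +Link idea): only src stores a reference."""
    src.props[prop] = dst


def joint(a, prop_a, b, prop_b):
    """Two-way edge (the +Joint idea): each end stores the other."""
    a.props[prop_a] = b
    b.props[prop_b] = a


# Hypothetical example data.
alice, bob, acme = Node("alice"), Node("bob"), Node("acme")
acme.props["city"] = "Oslo"

link(alice, "employer", acme)            # alice -> acme; acme knows nothing
joint(alice, "partner", bob, "partner")  # alice <-> bob

# An edge that needs values of its own becomes a node in between.
job = Node("job-1")
job.props["since"] = 2019
link(job, "person", bob)
link(job, "company", acme)

print(alice.get("employer", "city"))  # two hops along edges
print(bob.get("partner").name)
```

The point of the sketch: an edge is nothing more than a property whose
value is another node, so following edges is just repeated property
lookup. What the model leaves out entirely is persistence, lazy loading
and indexing.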

I hope this gives you a better insight.

Kind regards
beneroth

On 25.03.20 21:38, Lawrence Bottorff wrote:
> I'm afraid at my level of CS theory I don't really know what is meant
> by a picolisp atom being persistent, much less across distributed
> picolisp instances. Could someone give me a concrete example of what
> you describe as: "Any named bag of items automatically represents a
> (directed, undirected) graph. The name then is the node, the items in
> the bag then there represent the edges." I do understand the tree
> structure of a lisp program. But that doesn't make it a graph
> database. When I tried to fathom the Picolisp "graph database"
> example, I was quickly confused. The GUI actually added confusion,
> AFAIC. I'm guessing from what I could puzzle out from the example that the
> Picolisp version of a CLOS object is a vertex, and the inheritance of
> that object from other (higher, more general?) objects is a sort of
> edge. Correct me if I'm wrong. But then there was talk of "records."
> Is creating a record the same as creating an object instance -- and
> this record/object is a vertex? Where, what are the edges?
>
> Don't get me wrong, I have long felt that Lisp -- with its parsing
> actually visible in the code you write -- is or could be very
> graphDB-friendly; however, Lisp is functional, i.e., you write
> functions. And even though they are set up as a graph-like tree in
> nested lists form, they are not in themselves data in the traditional
> sense, rather, code meant to take you from a domain/input to a
> range/result. This is not a "record" (or graph vertex?)
> creation/query/deletion paradigm.
>
> But this relates to a long-standing question I've had about software
> libraries. As it stands, they may be auto-indexed for our viewing
> pleasure, but they aren't in any real database form so that you might
> simply have your program "query-and-plug-in" a library. (Although I've
> heard Haskell's hlint almost writes your code for you!) No, you have
> to find the module, plug it in yourself. The whole "code is data",
> therefore, doesn't seem to get past the higher-order function trick of
> passing in a function as an argument. What more is there to "code is
> data?" In Fortran the data was in fact parked just below the code.
>
> At some point I'm just scared and rambling on....
>
> On Wed, Mar 25, 2020 at 7:12 AM Guido Stepken <gstep...@gmail.com> wrote:
>
>     Lawrence, you haven't yet understood that any Lisp, by default,
>     is its own Graph Database. Especially Picolisp, where Alex has
>     made any Picolisp Atom persistent and even distributed across
>     other Picolisp instances. 'Data is code, code is data'. 
>
>     Any named bag of items automatically represents a (directed,
>     undirected) graph. The name then is the node, the items in the bag
>     then there represent the edges. Even Picolisp sources you can
>     consider a (directed) graph, often also called 'syntax tree'.
>
>     If you like, you can put, group all "edges" with same properties
>     into a new, searchable bag of edges for fast lookup. Since it's
>     all lazy evaluated (even the persistent nodes), as Alex already
>     pointed out, it's still ultra fast. And since in Picolisp
>     everything can be persisted distributed, Picolisp automatically
>     represents a distributed graph database (with sharding and
>     everything) which you can build, implement on your own with just a
>     few lines of code. It's a no-brainer!
>
>     Picolisp is a genius strike, but most people can't see the forest
>     for all the trees
>
>     Have fun!
>
>     Regards, Guido Stepken
>
>     P.S. Keep away from Windows and other viruses!
>
>     On Thursday, 12 March 2020, Lawrence Bottorff
>     <borg...@gmail.com> wrote:
>
>         I take it the picolisp graph database follows more the Neo4j
>         property graph idea than any RDF/OWL triples, correct? That
>         seems obvious, but I thought I'd check. I haven't dived in
>         deep, but you seem to use Lisp objects to create a vertex. But
>         then what are the edges? Again, I'm just getting started.
>
>         LB
>         Grand Marais, MN, Oberer See
>
